Overcoming the Challenges of Adding Machine Learning to Your Products
Overview of Artificial Intelligence and Machine Learning in Embedded Applications
What Are Artificial Intelligence, Machine Learning, and Deep Learning?
Let's examine the concepts of Artificial Intelligence (AI) and Machine Learning (ML). What do these terms mean? The field of AI has been around since the 1950s, and even earlier counting Alan Turing's pioneering work. AI has been defined as any non-human program or model capable of performing complex tasks, or of creating the perception of doing so. The Turing Test, for instance, poses a scenario in which, if one cannot distinguish the responses of a machine from those of a human, the machine is considered to have passed. This does not imply the machine knows everything, but rather that it can convincingly perform sophisticated tasks.
Moving on to ML: this subset of AI enables computers to learn from data. By applying statistical methods and mathematical transformations, a computer can derive rules or extract meaning from data without being explicitly programmed for a specific task. Unlike traditional programming, where data is processed through a predefined sequence of hand-written steps, machine learning allows the computer to develop its own rules to achieve the desired outcome.
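To make the contrast concrete, here is a minimal sketch in Python using scikit-learn (a toolchain chosen only for illustration, not prescribed by this article). The first function is an explicitly programmed rule; the classifier derives an equivalent rule from labeled examples instead.

```python
from sklearn.tree import DecisionTreeClassifier

# Explicit programming: the engineer writes the rule by hand.
def is_overheating(temp_c):
    return temp_c > 85.0  # threshold chosen by a human

# Machine learning: the rule is derived from labeled examples.
samples = [[70.0], [80.0], [88.0], [95.0]]  # temperature readings
labels = [0, 0, 1, 1]                       # 0 = normal, 1 = overheating

model = DecisionTreeClassifier()
model.fit(samples, labels)    # the threshold is learned, not coded
print(model.predict([[90.0]]))  # -> [1]
```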
Lastly, we have the concept of Deep Learning (DL), a subset of machine learning that uses neural networks with multiple layers. DL enables the identification of higher-level features within data. For example, when analyzing an image, neural networks can detect edges, colors, and shapes, and from these elements recognize more complex features such as a nose, an eye, or a face. Although our graphic highlights 2010 as a significant year for deep learning, research in neural networks dates back to the 1980s, demonstrating the long-standing interest and development in this area.
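To illustrate what "multiple layers" means, here is a minimal convolutional network sketch in Python with Keras (an assumed toolchain, used purely for illustration). Each successive layer composes the previous layer's features into higher-level ones.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Each successive layer builds higher-level features from the previous one:
# edges -> textures and shapes -> object parts (eyes, noses) -> whole faces.
model = tf.keras.Sequential([
    layers.Input(shape=(64, 64, 3)),          # small RGB image
    layers.Conv2D(8, 3, activation="relu"),   # low-level: edges, color blobs
    layers.MaxPooling2D(),
    layers.Conv2D(16, 3, activation="relu"),  # mid-level: shapes, textures
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),  # high-level: object parts
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),    # e.g., face / no face
])
model.summary()
```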
Why is Machine Learning so Popular Now?
Machine learning has existed for a long time, but why is it suddenly gaining so much attention?
Access to Fast Computational Power in the Cloud
Cloud platforms provide scalable and cost-efficient resources, allowing large datasets to be processed and complex models to be trained without significant upfront investment in hardware. This accessibility extends to small businesses and individual developers, making advanced computational resources widely available. Cloud environments also facilitate collaboration among geographically dispersed teams and offer integrated machine learning tools and services, simplifying the development, training, and deployment of models. Additionally, specialized hardware like Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs) available in the cloud significantly improves training and inference process times. These factors combined have made cloud computing a powerful enabler for the widespread adoption and growth of machine learning technologies.
Access to Large Datasets
Additionally, there is now better availability of large datasets. One of the challenges in embedded ML is sourcing data. For larger problems like character recognition or vision, you need extensive datasets with numerous examples. These large datasets are becoming more accessible.
Access to Machine Learning Algorithms
There has also been significant progress in the algorithms themselves—not just the models, but the algorithms used to generate the models and how those models learn. These model-generation mechanisms and tools (e.g., MPLAB® Machine Learning Development Suite) are becoming more accessible, enabling the algorithms to run on microcontrollers (MCUs), microprocessors (MPUs), and Field Programmable Gate Arrays (FPGAs).
Why Machine Learning at the Edge?
It turns out that the computational power required to design and train ML models is quite high; far less is needed to actually run them.
Machine Learning has created the potential for new applications and capabilities at the edge (i.e., the edge of a communications network). If you can identify the correct algorithms and deploy them on the appropriate devices, you can achieve real-time inference with low latency. With the right model, you can process and analyze data locally, at the edge, without sending it through network switches or the cloud, which is crucial for applications requiring real-time interaction and autonomy.
Keeping data local is also advantageous for privacy reasons. For instance, you may not want information about your front doorbell and occupancy to be posted on the internet. However, you would still like to receive notifications about package deliveries. By processing data locally, you can maintain privacy while still receiving important alerts.
Reducing power consumption is another significant benefit. This can be achieved by eliminating high-speed real-time communication links between the edge and the cloud. For example, devices like Amazon's Alexa remain in a low-power state until woken up with a keyword. After wakeup, command processing may take place in the cloud, further reducing power consumption at the edge.
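A hedged sketch of this pattern in Python follows; every function here is a hypothetical placeholder standing in for real audio capture, a wake-word model, and a cloud service, none of which this article specifies.

```python
import time

# All functions are hypothetical placeholders sketching the pattern,
# not a real audio or cloud API.

def capture_audio_frame():
    """Placeholder: cheap, always-on sampling of a short audio frame."""
    return b""

def detect_wake_word(frame):
    """Placeholder: a tiny on-device model checks one frame for the keyword."""
    return False

def record_command():
    """Placeholder: capture a few seconds of audio after wake-up."""
    return b""

def send_to_cloud(audio):
    """Placeholder: heavier command recognition runs off-device."""
    pass

while True:
    # Only the tiny wake-word model runs continuously; the power-hungry
    # cloud link is used briefly, and only after the keyword is heard.
    if detect_wake_word(capture_audio_frame()):
        send_to_cloud(record_command())
    time.sleep(0.02)  # duty-cycled loop keeps average power low
```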
Consider the example of an Arlo® security system, which only activates and sends images when it detects someone carrying a package. This reduces power consumption and extends the battery life of the camera from one month to eight or ten months.
Machine Learning Edge Applications
At the high end, you have smart embedded vision for quality control in factories, enhanced medical diagnostics, camera security, and surveillance.
Smart Human-Machine Interface (HMI) applications could involve local processing of voice commands, where you speak a wake-up word followed by simple commands like "on," "off," "left," "right," "up," "down," or "stop."
Another smart HMI application is interactive gesture recognition. A related use of machine learning is snoop detection (i.e., detecting multiple faces looking at your monitor): a camera watches the area in front of the display and locks the interface if someone is looking over your shoulder.
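A minimal sketch of the idea in Python with OpenCV (the detector choice and the `lock_screen` helper are illustrative assumptions; the article does not specify an implementation):

```python
import cv2  # OpenCV, used here purely for illustration

def lock_screen():
    """Placeholder for a platform-specific screen-lock call."""
    print("Second face detected - locking the interface")

# A classic (non-deep-learning) face detector keeps the sketch simple;
# an embedded product might use a small neural network instead.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

camera = cv2.VideoCapture(0)
ok, frame = camera.read()
camera.release()
if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 1:  # more than one face: someone may be snooping
        lock_screen()
```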
Smart predictive maintenance applications don't require much performance and can even be implemented on an 8-bit MCU with attached sensors. Mounted on electric motors or bearings, these sensors detect aging or degradation and can send warnings that maintenance is needed, preventing potential failures.
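A minimal Python sketch of the feature-and-threshold idea (all values are illustrative assumptions; a deployed application would typically feed such features to a trained classifier):

```python
import numpy as np

def vibration_features(window):
    """Compute simple features from one window of accelerometer samples."""
    rms = np.sqrt(np.mean(np.square(window)))
    peak = np.max(np.abs(window))
    return rms, peak

# Hypothetical baseline captured from a healthy motor during commissioning.
HEALTHY_RMS = 0.12  # g, illustrative value
THRESHOLD = 2.0     # warn when vibration doubles

window = np.random.normal(0.0, 0.15, 256)  # stand-in for real sensor samples
rms, peak = vibration_features(window)
if rms > THRESHOLD * HEALTHY_RMS:
    print(f"Maintenance warning: RMS vibration {rms:.3f} g exceeds baseline")
```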
| ML Edge Applications | Requirements | Input Sensors | Silicon |
|---|---|---|---|
| Smart Embedded Vision | | | |
| Smart Human Machine Interface (HMI) | | | |
| Smart Predictive Maintenance | | | |
Steps for Developing a Machine Learning Application
Define Problem
Data Preprocessing
Feature Selection
Model Selection
Model Training
Having selected your model, you still do not have something you can actually run. You now need to train the model on the data prepared during the data preprocessing step, tuning its parameters.
At this point, it is useful to think about a Proportional-Integral-Derivative (PID) controller, even though it is not an AI/ML example. A PID controller fundamentally takes a proportional signal, adds it to an integral signal, and adds that to a derivative signal; this is the model for controlling a motor or another system. The challenge lies in determining how much P, I, and D to use. Determining those coefficients is exactly analogous to the training step of a model.
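A minimal Python sketch of that structure (the gains are illustrative, not a tuned controller):

```python
def pid_step(error, state, kp, ki, kd, dt):
    """One update of a PID controller: output = P + I + D.

    kp, ki, kd are the coefficients that tuning ("training") must determine.
    """
    integral = state["integral"] + error * dt
    derivative = (error - state["prev_error"]) / dt
    state["integral"], state["prev_error"] = integral, error
    return kp * error + ki * integral + kd * derivative

# Illustrative gains; in practice these are tuned against the real system,
# just as a machine learning model's parameters are fitted to data.
state = {"integral": 0.0, "prev_error": 0.0}
output = pid_step(error=1.5, state=state, kp=2.0, ki=0.5, kd=0.1, dt=0.01)
print(output)
```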
Model Evaluation
You need to evaluate the model to determine its accuracy and how well it generalizes to incoming data. It is important to note that you cannot simply check it against the data used for training: performance on the training data is an overly optimistic estimate, and an overfitted model can score close to 100% on it. Therefore, you need to test the model against a separate dataset that was held out from training.
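A minimal evaluation sketch in Python with scikit-learn (the dataset and model are chosen only for illustration):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_digits(return_X_y=True)

# Hold out data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)

# Accuracy on the training data is overly optimistic...
print("train accuracy:", accuracy_score(y_train, model.predict(X_train)))
# ...so judge the model on the held-out test set instead.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```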
After evaluating your model, you may find that it is not optimal. In such cases, you might need to consider a different architecture for your model. This could mean returning to the model selection step, as the evaluation may indicate the need for retraining.
Deployment
Monitoring and Maintenance
One step that is often overlooked is the monitoring and maintenance of your model. Many models are trained once: you set the system running, it learns, and then you finalize and lock the model, ensuring it remains unchanged. However, there may be scenarios that require a system that continues to learn over time. For instance, consider a machine tool that wears down over several years; its failure modes, noise, and signals may change, necessitating adaptation.
When transitioning to the deployment step, you may discover that the 64-bit double-precision floating-point numbers used during model training and evaluation are not practical on the target: too slow, too large, or too memory-hungry. In such cases, you might opt for a slightly less accurate model representation that is compatible with the hardware, ensuring it still functions effectively.
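One common remedy is quantization. Here is a minimal numpy sketch of the idea (real toolchains, such as TensorFlow Lite or vendor tools, automate this with calibration):

```python
import numpy as np

# Trained weights arrive as 64-bit floats...
weights = np.random.randn(1000).astype(np.float64)

# ...but a simple affine quantization maps them to int8 for the target MCU.
scale = np.max(np.abs(weights)) / 127.0
q_weights = np.round(weights / scale).astype(np.int8)  # 8x smaller in memory

# At inference time the device works with integers and rescales the result.
reconstructed = q_weights.astype(np.float32) * scale
print("max quantization error:", np.max(np.abs(weights - reconstructed)))
```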
Where Does Each Step Occur?
All of these steps occur in different locations. Data collection happens on the MCU at the beginning; typically, you collect the data on the same device that will ultimately execute the model. This might not be the case for more sophisticated data, such as images.
The tasks in the middle, such as organizing and analyzing the data, will be performed in a cloud or a data center. This is where Microchip's MPLAB Machine Learning Development Suite comes into play.