Gesture Recognition with SensiML
- Objective
- Materials
- Procedure
- Flashing the Gesture Classifier Demo Firmware
- Gesture Classifier Firmware Overview
- Data Collection Overview
- Data Collection: Sensor Configuration
- Data Collection: Collection Protocol
- Data Collection: Post-Processing
- Data Collection: Data Capture Tools
- Data Import with Data Capture Lab
- Model Development
- Knowledge Pack Integration
- Final Remarks
Objective
This tutorial will guide you through the process of building a gesture classifier with SensiML and deploying it to the Microchip SAMD21 Machine Learning Evaluation Kit. We'll also provide some guidance on the factors you should consider when designing your data collection process and solutions to common issues you may encounter while developing your application. A fully developed gesture classifier project, including a dataset, a pre-trained machine learning model, and firmware source code, is provided with this guide to help you get up and running quickly with SensiML and the SAMD21 ML Evaluation Kit.
Materials
Hardware Tools
- SAMD21 Machine Learning Evaluation Kit with BOSCH IMU or with TDK IMU
Software Tools
- MPLAB® X IDE
- SensiML Analytics Studio and Data Capture Lab (Free account)
Exercise Files
- The firmware and MPLAB X project files can be found in the GitHub repository.
- The dataset used in this tutorial can be downloaded from the latest GitHub release.
- Pre-built firmware files for gesture recognition and data collection can be downloaded from the latest GitHub release.
Procedure
Before we get started, you'll need to install and set up the required software as detailed in the steps below.
Flashing the Gesture Classifier Demo Firmware
We are now set up to run the pre-built firmware. Go ahead and program your device with the firmware HEX file from the latest GitHub release using the following steps.
Open the ml-samd21-iot-sensiml-gestures-demo.zip archive downloaded previously and locate the gesture classifier demo HEX file corresponding to your sensor make:
- Bosch IMU: binaries/samd21-iot-sensiml-gestures-demo_bmi160.hex
- TDK IMU: binaries/samd21-iot-sensiml-gestures-demo_icm42688.hex
Gesture Classifier Firmware Overview
For a description of the demo firmware included with this project, including its operation, usage, and benchmarks, see the "README" section in the GitHub repository.
Data Collection Overview
Before we jump into collecting data samples, we should give some consideration to the design of our data collection process; after all, the data we collect will ultimately determine the kind of performance we can expect from our machine learning model.
Data Collection: Sensor Configuration
The first step in the data collection process is to determine the best sensor configuration for your application; this includes both the physical placement and installation of the sensor as well as signal processing parameters like sample rate and sensitivity.
Most likely, many of your design parameters for sensor configuration are fixed (due to, e.g., a fixed board design, shared sensor usage, etc.), but it is worth considering whether the application design is optimal for your machine learning task and whether some design parameters should be changed. The question you should be asking at this point in the design is this: can I reasonably expect an algorithm to predict the desired output given the sensor data input? Data exploration (e.g., visualization) and a good working knowledge of the signal domain (i.e., an understanding of the physical processes at work) will help you form good initial hypotheses here.
Here are a few specific questions we might ask during the sensor configuration stage and some possible answers:
- How should the sensor sampling parameters be configured? (i.e., sample rate, sensitivity/input range, etc.)
- Choose a sensor configuration that captures the events of interest in a reasonably compact representation, with a good signal-to-interference ratio.
- How should the sensor be placed? (i.e., mounting and orientation)
- Choose a placement that will minimize the susceptibility to interference (such as vibrations from an engine).
- How should the sensor be fixed?
- Choose a method that will ensure consistency between readings over time and across different sensor deployments.
The main sensor configuration parameters chosen for this project and the justification behind their choices are as follows:
- Accelerometer only
- Chosen gestures should be mostly invariant to device rotation
- 100 Hz Sample Rate
- Chosen gestures have frequency content typically below 5 Hz (i.e., a 10 Hz Nyquist rate), but 100 Hz was chosen for flexibility in the data collection process
- 16 G accelerometer range
- Least sensitive setting since we're not interested in micro-movements
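To make these choices concrete, they could be captured in firmware as a handful of configuration constants, as in the hypothetical sketch below. The macro names are illustrative only and are not taken from the demo source.

```c
/* Illustrative configuration constants reflecting the choices above.
 * Macro names are hypothetical, not the demo firmware's actual defines. */
#define SNSR_SAMPLE_RATE_HZ   100  /* well above the ~10 Hz Nyquist rate of the gestures */
#define SNSR_ACCEL_RANGE_G    16   /* least sensitive setting; micro-movements are not of interest */
#define SNSR_USE_ACCEL        1    /* accelerometer only... */
#define SNSR_USE_GYRO         0    /* ...since the gestures should be mostly invariant to device rotation */
#define SNSR_NUM_AXES         3    /* ax, ay, az */
```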
Data Collection: Collection Protocol
The next step in the data collection process is putting together a protocol to use when collecting your data.
Roughly speaking, we want to achieve three things with the protocol:
1. A reproducible methodology for performing data collection
A reproducible methodology ensures that the data collection process is performed in a prescribed manner, with minimal variations between measurements, and ensures the integrity of our data.
2. Sampling parameters that will ensure we have a sufficient number of samples for development, and enough diversity (i.e., coverage) to enable our end model to generalize well
A good rule of thumb is that you need at least tens of samples for each class of event you want to classify (30 is a good starting point); however, this number may increase depending on the variance between samples. Taking the gestures application as an example, if you wanted to detect a circle gesture but wanted your model to be invariant to the size or speed of the circle, you would need many more samples to cover that range of variation.
Another thing to consider when selecting a sample size is that you will invariably capture noise (i.e., unintended variances) in your samples; the hope is that with enough samples, the training algorithm will have enough information to learn to discriminate between the signal of interest and the noise.
A word to the wise: start small! Anticipate that the development of your data collection process will require some iteration; refine your process first, then start scaling up.
3. A set of metadata variables to be captured during the collection process that can be used to explain the known variances between samples
Metadata variables (or tags) are the breadcrumbs you leave yourself to trace your data samples once they're joined into a larger sample pool; among other things, these tags can be used to explore subgroups within your data (e.g., all gestures performed by a single test subject) and to track down any data issues you might uncover later (e.g., hardware problems, outlier samples, etc.).
For this demo project, we created a data protocol document that specified how gestures should be performed and what metadata should be collected along with them. To illustrate, below are the directives that constrained how the test subject would perform the gestures for collection. The text in italics defines the fixed experimental parameters that we explicitly control for.
- Subject should perform gestures that follow the specified trajectory description (e.g., clockwise wheel)
- Subject should perform gestures smoothly, in a way that feels natural to them
- Subject should perform the gesture continuously for at least ten seconds
- Subject should be standing
- Subject should use dominant hand
- Subject should hold the board with a thumb and forefinger grip with the cord facing down as shown in Figure 2.
In addition, the following metadata values were logged for each data collection.
- Date of capture
- SAMD21 Test board ID
- Test environment ID
- Test subject ID
- (For idle class data only) Placement and orientation of SAMD21 board
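As an illustration only, this metadata could be represented as one record per capture; the structure below is a hypothetical sketch rather than part of the demo code or tooling.

```c
#include <stdint.h>

/* Hypothetical record of the per-capture metadata listed above. */
typedef struct {
    char     capture_date[11];    /* date of capture, e.g. "YYYY-MM-DD" */
    uint16_t board_id;            /* SAMD21 test board ID */
    uint16_t environment_id;      /* test environment ID */
    uint16_t subject_id;          /* test subject ID */
    char     idle_placement[32];  /* idle class only: board placement/orientation */
} capture_metadata_t;
```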
Data Collection: Post-Processing
Finally, all data samples were post-processed to form the final dataset.
- Data was split into exactly ten-second samples
- Samples were formatted as CSV files with the following naming convention:
<class>-<participant-id>-<extra-metadata>-<collection-date-yymmdd>-<integer-sample-number>.csv
- Samples were split into folds with 80% being allocated to development and 20% to testing
- The split was stratified so that the proportion of samples per class and per subject ID was the same in the development and test sets.
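The helper below is a purely illustrative sketch of how that file naming convention could be generated; the function name and example values are not taken from the project tooling.

```c
#include <stdio.h>

/* Build a sample file name following
 * <class>-<participant-id>-<extra-metadata>-<collection-date-yymmdd>-<integer-sample-number>.csv */
static void make_sample_filename(char *out, size_t out_len,
                                 const char *cls, const char *participant_id,
                                 const char *extra_metadata, const char *date_yymmdd,
                                 unsigned sample_number)
{
    snprintf(out, out_len, "%s-%s-%s-%s-%u.csv",
             cls, participant_id, extra_metadata, date_yymmdd, sample_number);
}

/* Example (hypothetical values):
 *   make_sample_filename(name, sizeof(name), "wheel", "s01", "righthand", "210101", 3)
 *   yields "wheel-s01-righthand-210101-3.csv" */
```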
Data Collection: Data Capture Tools
For this guide, we'll be using the pre-built dataset included with the gestures demo, but to build your own dataset you can use the MPLAB Data Visualizer and Machine Learning plugins for MPLAB X. These plugins can be used in tandem to capture samples and export them as a CSV or DCLI file that can be easily imported into SensiML's Data Capture Lab.
To use the ML Evaluation Kit with MPLAB Data Visualizer, you'll need to use the data logger firmware maintained on the "SAMD21 ML Evaluation Kit Data Logger" page. For convenience, pre-built binaries for the sensor configuration used in this project have been packaged in the ml-samd21-iot-sensiml-gestures-demo.zip archive included in the latest release:
- Bosch IMU: binaries/samd21-iot-data-visualizer_bmi160_100hz-axayzgxgygz-16g-2000dps.hex
- TDK IMU: binaries/samd21-iot-data-visualizer_icm42688_100hz-axayzgxgygz-16g-2000dps.hex
Refer to the "Using the ML Partners Plugin with SensiML" guide for more information on the data capture process.
Data Import with Data Capture Lab
Let's move on to importing our data into a new SensiML project.
With the newly created project open, navigate to the File menu and click the Import from DCLI… item as shown in Figure 3.
When you reach the Select a Device Plugin dialog, click the SAMD21 ML Eval Kit item as shown in Figure 4 and click Next.
After selecting the device plugin, the Plugin Details page will appear; click Next to move forward to the Sensor Properties page. On the properties page, fill out the fields to match the configuration shown in Figure 5 (or select the ICM sensor if you are using the TDK IMU), then click Next.
Finally, give a name to the sensor configuration in the Save Sensor Configuration window. As shown in Figure 6, we've simply selected the name BMI160.
Repeat the DCLI import steps above to import the test samples (dataset/test/test.dcli); this is the data that will be used to validate the model. When prompted, use the same sensor configuration that we created in the previous step.
At this point, our project is set up with the data we need and we can move on to the model development stage.
Model Development
Let's now move into the Analytics Studio to generate our classifier model.
Navigate to the Home tab to view your projects and open the project you created in the previous section.
Navigate over to the Prepare Data tab to create the query that will be used to train your machine learning model. Fill out the fields as shown in Figure 9; these query parameters will select only the samples in the training fold, and only use the accelerometer axes.
The SensiML Query determines what data from our dataset will be selected for training. We can use it to exclude samples (e.g., our test samples) or data axes (e.g., the gyroscope axes).
Switch over to the Build Model tab to start developing the machine learning model. Fill out the fields as shown in Figure 10. Note that the only settings that need to be changed from their defaults are the Query (created in the last step), the Optimization Metric (f1-score), and the Window Size (200 samples).
Due to the imbalance in the gesture dataset's class distribution, choosing the accuracy optimization metric here would bias the model optimization toward the classes with more samples; hence we opt for the f1-score, which provides a more representative measure of model performance.
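For reference, the f1-score for each class is the harmonic mean of precision and recall, so a model cannot score well simply by favoring the majority classes:

$$\mathrm{F1} = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$

The per-class scores are typically averaged across classes to yield a single optimization metric.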
We choose a Window Size of 200 (i.e., two seconds at the 100 Hz IMU sample rate) here since that will be long enough to cover at least one cycle of the gestures we're interested in.
Once you've entered the pipeline settings, click the Optimize button. This step will use AutoML techniques to automatically select the best features and machine learning algorithm for the gesture classification task given your input data. This process will usually take several minutes.
More detailed information about the AutoML configuration parameters can be found on the AutoML documentation page.
Once the Build Model optimization step is completed, navigate to the Test Model tab.
Click Compute Summary to generate the confusion matrix for the test samples. This should take a few minutes; once completed, you will be presented with a table like the one shown in Figure 12 summarizing the classification results.
Finally, navigate to the Download Model tab to deploy your model. Fill out the Knowledge Pack settings using the Pipeline, Model, and Data Source you created in the previous steps, and select the Library output format (see Figure 13 for reference), then click the Download button.
You now have a compiled library for the SAMD21 containing your machine learning model that you can integrate into your project. For more detailed information on the Analytics Studio, head over to SensiML's documentation page.
Knowledge Pack Integration
Let's take our SensiML library (i.e., knowledge pack) and integrate it into an existing MPLAB X project using the gestures demo project as a template.
Scroll down to where the class_map variable is defined (see Figure 14 for reference). Modify the class_map strings to match up with the class mapping that was displayed in the Download Model step of the Analytics Studio. Note that the "UNK" class (integer 0) is reserved by SensiML, so this mapping won't change.
Scroll a bit further down inside the main while loop until you reach the section shown in Figure 15 that begins with a call to buffer_get_read_buffer. This is the core of the application: it calls into the SensiML knowledge pack via the kb_run_model function for every sample we get from the IMU, and calls kb_reset_model whenever an inference is successfully made.
Modify the LED handling code here to reflect your class mapping.
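For reference, below is a simplified sketch of that inference loop. It is not the demo's exact code: the class names are placeholders for the mapping shown in the Download Model step, printf stands in for the demo's LED handling, and the model index is whichever index your generated Knowledge Pack headers define.

```c
#include <stdio.h>
#include "kb.h"   /* SensiML Knowledge Pack API (kb_run_model, kb_reset_model) */

/* Placeholder class mapping; replace these strings with the mapping shown
 * in the Download Model step. Class 0 ("UNK") is reserved by SensiML. */
static const char *class_map[] = {"UNK", "wave", "wheel", "figure-eight", "idle", "unknown"};

/* Feed one frame of IMU data to the model. kb_run_model() returns a negative
 * value until a full window has been buffered and classified. */
static void process_imu_frame(SENSOR_DATA_T *frame, int n_axes, int model_index)
{
    int class_id = kb_run_model(frame, n_axes, model_index);
    if (class_id >= 0) {
        /* An inference was made: act on it (the demo drives the LEDs here),
         * then reset the model so it starts buffering the next window. */
        printf("Detected gesture: %s\r\n", class_map[class_id]);
        kb_reset_model(model_index);
    }
}
```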
Okay, you are now ready to compile. Go ahead and click the Make and Program Device button in the toolbar to compile and flash your firmware to the SAMD21 MCU.
Final Remarks
That's it! You should now have a basic understanding of developing a gesture recognition application with SensiML and Microchip hardware.
For more details about integrating your Knowledge Pack with an existing MPLAB X project, refer to SensiML's Knowledge Pack documentation.
To learn more about the SensiML Analytics Studio and Data Capture Lab, including tutorials for other machine learning applications, head over to SensiML's documentation page.