====== Lab 6. Supervised Learning: TinyML ======
  
===== What AI/ML + IoT actually is =====
  * Keep the board still → LED should be off (prob_shake < 0.5).
  * Shake the board → LED should go green (prob_shake > 0.5).

===== Second example: Wake word detection =====

This is a more complex example, in which we will use Sparrow's on-board microphone to detect a simple wake word. The user will speak "hello" and the board will signal a correct detection by flashing the neopixel LED.

The pipeline is similar to the previous example: we first need to acquire a large number of audio samples from the microphone, both of the wake word and of background noise or other words, and then train the NN on the captured dataset.

===== Project setup and dataset capture =====

Import this project into PlatformIO:
{{:iothings:laboratoare:2025:aispeak.zip|AISpeak.zip}}

It should have the structure below:
<code>
include/
src/
  |-audio_capture.cpp
scripts/
  |-data/
    |-hello/
    |-other/
  |-capture_host.py
  |-trainer_improved.py
platformio.ini
</code>

Flash your board with ''audio_capture.cpp'' and then quit VSCode, so that its serial monitor does not keep the port busy.

From your terminal, run ''capture_host.py'' and press ''h'' to record a wake-word sample or ''o'' to record a sample of ambient sound or any other word.

<note important>You must speak the wake word immediately after pressing ''h'', as you have only a one-second window!</note>

<note tip>Record at least 50 'hello' samples and 50-100 'other' samples.</note>
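
If you are curious what the host side has to do, the sketch below shows a minimal capture loop in Python. It is illustrative only, not the actual ''capture_host.py'': it assumes the board answers a single ''h''/''o'' command byte with one second of raw 16-bit PCM at 16 kHz, and the port name is a placeholder.

<code python>
# Illustrative host-side capture loop (NOT the actual capture_host.py).
# Assumes the board replies to a single 'h'/'o' command byte with one
# second of raw 16-bit little-endian PCM at 16 kHz.
import os
import wave

import serial  # pip install pyserial

PORT = "/dev/ttyUSB0"   # placeholder: adjust to your board's serial port
BAUD = 115200
RATE = 16000            # assumed sample rate

def capture(label: str, index: int) -> None:
    """Request one clip from the board and store it under data/<label>/."""
    cmd = b"h" if label == "hello" else b"o"
    with serial.Serial(PORT, BAUD, timeout=5) as ser:
        ser.write(cmd)
        raw = ser.read(RATE * 2)          # 2 bytes per sample, 1 second
    os.makedirs(f"data/{label}", exist_ok=True)
    with wave.open(f"data/{label}/{index:03d}.wav", "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(RATE)
        wf.writeframes(raw)

capture("hello", 0)   # speak the wake word right after running this
</code>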

===== Training the model =====

After you have enough samples to build your dataset, you can train your model. Do this by running the second Python script, ''trainer_improved.py''.

Just like in the previous example, it will generate an ''audio_model.h'' file, which you will need to include in your project.

<code>
python3 trainer_improved.py
X shape: (140, 980) expected input: 980
Test accuracy: 0.9642857142857143
checksums: mu -58395.66015625 sig^-1 111.23896789550781 W0 809.772705078125 b0 1.0469250679016113 W1 4.047879695892334 b1 0.33552950620651245
Wrote include/audio_model.h
</code>
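
The checksums line hints at the model's structure: an input-normalization step (''mu'', ''sig^-1'') followed by two dense layers (''W0''/''b0'', ''W1''/''b1''), with 980 input features per clip. As a rough sketch of what such a trainer does (the real ''trainer_improved.py'' may differ in feature extraction, hidden size, and header layout), assuming the features are already extracted into numpy files:

<code python>
# Rough trainer sketch (NOT the actual trainer_improved.py): normalize the
# 980-dim features, fit a tiny two-layer MLP, export weights as a C header.
# The hidden size (16) and the features.npy/labels.npy files are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def c_array(name, arr):
    """Format a numpy array as a C float array definition."""
    arr = np.asarray(arr, dtype=np.float32).ravel()
    vals = ", ".join(f"{v:.8f}f" for v in arr)
    return f"static const float {name}[{arr.size}] = {{ {vals} }};\n"

X = np.load("features.npy")   # shape (num_clips, 980)
y = np.load("labels.npy")     # 0 = other, 1 = hello

mu, sig = X.mean(axis=0), X.std(axis=0) + 1e-6
Xn = (X - mu) / sig           # the same normalization must run on the board

X_tr, X_te, y_tr, y_te = train_test_split(Xn, y, test_size=0.2, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
mlp.fit(X_tr, y_tr)
print("Test accuracy:", mlp.score(X_te, y_te))

with open("include/audio_model.h", "w") as f:
    f.write("// Auto-generated model weights\n")
    f.write(c_array("MU", mu))
    f.write(c_array("INV_SIG", 1.0 / sig))
    f.write(c_array("W0", mlp.coefs_[0]))       # (980, 16)
    f.write(c_array("B0", mlp.intercepts_[0]))  # (16,)
    f.write(c_array("W1", mlp.coefs_[1]))       # (16, 2)
    f.write(c_array("B1", mlp.intercepts_[1]))  # (2,)
</code>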
===== Running the wake word example =====

Replace your main project file in PlatformIO with [[iothings:laboratoare:2025_code:lab6_7|this one]] and flash it to your board. When you speak the wake word, the LED will flash blue.

For example, this is what a successful wake word detection looks like:
<code>
probHello=0.0000 (gated by RMS)
rms=0.00016
probHello=0.0000 (gated by RMS)
rms=0.00373
feat[min=-74.45 max=-0.42 mean=-63.49]
# featRaw[0..7]: -42.81 -62.15 -62.71 -67.39 -64.80 -67.98 -68.99 -70.97
# featNorm[0..7]: -0.20 -0.21 -0.04 -0.27 0.14 -0.12 -0.10 -0.35
logits[off=0.261 on=-1.694] prob=0.1239 avg=0.1239
rms=0.00521
feat[min=-74.45 max=-0.42 mean=-61.81]
logits[off=0.944 on=-3.833] prob=0.0083 avg=0.0661
rms=0.00665
feat[min=-74.45 max=-0.42 mean=-60.29]
logits[off=0.360 on=-3.811] prob=0.0152 avg=0.0492
rms=0.00692
feat[min=-74.45 max=-0.42 mean=-59.17]
logits[off=-0.390 on=-4.269] prob=0.0203 avg=0.0146
rms=0.00696
</code>
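
Reading the log: frames quieter than an RMS threshold are skipped outright (''gated by RMS''), while louder frames are turned into 980 features, normalized, pushed through the two dense layers to get ''off''/''on'' logits, softmaxed into ''prob'', and smoothed into ''avg'' (which in this log matches a three-frame sliding mean of ''prob''). The firmware is C++, but the per-frame decision logic can be sketched in Python as below; the threshold values are illustrative assumptions, not constants from the real code.

<code python>
# Python sketch of the per-frame decision logic seen in the log above
# (the firmware itself is C++). RMS_GATE and PROB_ON are assumed values.
from collections import deque

import numpy as np

RMS_GATE = 0.003          # skip near-silent frames (assumed threshold)
PROB_ON = 0.6             # flash the LED when avg crosses this (assumed)
window = deque(maxlen=3)  # avg in the log matches a 3-frame sliding mean

def step(rms, feat, mu, inv_sig, W0, b0, W1, b1):
    """One frame: RMS gate, normalize, two dense layers, smoothed probability."""
    if rms < RMS_GATE:
        return 0.0, False                  # "probHello=0.0000 (gated by RMS)"
    x = (feat - mu) * inv_sig              # featNorm in the log
    h = np.maximum(x @ W0 + b0, 0.0)       # dense layer 1 + ReLU (assumed)
    logits = h @ W1 + b1                   # logits[off, on]
    e = np.exp(logits - logits.max())      # numerically stable softmax
    prob = e[1] / e.sum()                  # probability of "hello"
    window.append(prob)
    avg = sum(window) / len(window)        # short sliding average
    return prob, avg > PROB_ON
</code>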