Untouchable LIMiTS

Introduction

LIMiTS (Looper Instrument & Metronome inside Theremin Synthesizer) is a musical instrument that combines the functionality of a synthesizer, an optical-sensor-based theremin, and a loop station, all processed on an ESP32-S3 microcontroller.

  • What it does: It generates real-time audio waveforms (sine, square, triangle, sawtooth) that the user can modulate. The pitch (frequency) is controlled either “in the air” by moving a hand over a ToF laser sensor, or via a DIY capacitive touch keyboard. The device allows recording a multi-bar audio sequence and playing it back in a continuous loop, offering the ability to play “over” your own recording (overdubbing).
  • The starting idea: The desire to build an electronic instrument that does not rely on expensive physical keyboards, while integrating a “loop pedal” directly into the internal memory to be able to have a jam session with yourself.

General description

The project is structured around the ESP32-S3-WROOM-1 (N16R8) microcontroller, which runs the audio synthesis on one processor core and the user interface/metronome on the other core, so that interface work never delays audio generation.

System Architecture (Main Blocks):

  • Input & Navigation Module: An I2C OLED display and an EC11 rotary encoder form the graphical user interface for selecting waveforms and ADSR envelope settings.
  • Sensory Module (The Instrument):
    • A VL53L1X Time-of-Flight sensor communicates via I2C to measure hand distance with millimeter precision, mapping the data to a musical scale.
    • A capacitive keyboard (aluminum foil strips) connects directly to the ESP32's internal hardware touch pins.
  • Looping: Tactile push buttons trigger loop recording directly into the 8MB PSRAM.
  • Metronome: An independent passive buzzer with configurable volume and BPM. Because it is separate from the sound-generating loop, it does not interfere with the recording.
  • Audio Output Module (The “Hacky” DAC): The digital signal (PDM/PWM) generated by the ESP32 passes through a 1st-order hardware Low-Pass RC filter for smoothing. It is then amplified by an LM386 module and routed directly to a pair of wired headphones via a modified 3.5mm jack splitter.
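The Sensory Module's mapping from hand distance to a musical scale can be sketched as follows. This is only an illustration: the playable range (50 to 450 mm), the C major pentatonic scale, the two-octave span, and the function name theremin_distance_to_note are all assumptions, not values taken from the firmware.

```c
#include <stdint.h>

/* Hypothetical playable range of the hand above the sensor, in millimetres. */
#define THEREMIN_MIN_MM 50
#define THEREMIN_MAX_MM 450

/* Assumed scale: C major pentatonic, as semitone offsets within one octave. */
static const uint8_t SCALE[] = {0, 2, 4, 7, 9};
#define SCALE_LEN 5
#define NUM_STEPS 10   /* two octaves of the scale */
#define BASE_NOTE 60   /* MIDI note number of middle C */

/* Returns a MIDI note number, or -1 when no hand is inside the range. */
int theremin_distance_to_note(int distance_mm)
{
    if (distance_mm < THEREMIN_MIN_MM || distance_mm >= THEREMIN_MAX_MM) {
        return -1;
    }
    int span = THEREMIN_MAX_MM - THEREMIN_MIN_MM;
    /* Quantize the distance into NUM_STEPS equal zones: 0..NUM_STEPS-1. */
    int step = (distance_mm - THEREMIN_MIN_MM) * NUM_STEPS / span;
    int octave = step / SCALE_LEN;
    return BASE_NOTE + 12 * octave + SCALE[step % SCALE_LEN];
}
```

Quantizing to scale steps instead of mapping distance to a continuous frequency means small hand tremors stay inside one zone and do not detune the note.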

Hardware Design

Bill of Materials (BoM):

  • 1x ESP32-S3-DevKitC-1 (N16R8): The main microcontroller (16MB Flash, 8MB PSRAM for the audio buffer).
  • 1x VL53L1X Time-of-Flight Sensor: For the Theremin functionality (I2C).
  • 1x 0.96” OLED Display: For the system menu (I2C).
  • 1x EC11 Rotary Encoder: With built-in push button, for menu navigation.
  • 1x LM386 Audio Amplifier Module: To amplify the filtered signal for headphones.
  • 3x 10k Linear Potentiometers (B10K): For BPM, volume, and master volume control (see the pin mapping).
  • 1x Passive Buzzer: For the integrated metronome.
  • 4x Tactile Push Buttons (Momentary): 12x12mm, for loop triggering (CH1 / CH2).
  • Passive Components (RC Filter & Decoupling):
    • 1x 104 Ceramic Capacitor (0.1µF)
    • 1x 475 Ceramic Capacitor (4.7µF) - For power supply stabilization on the breadboard.
    • Resistors (220Ω - 330Ω for the audio filter, 1kΩ for the buzzer).
  • 1x 3.5mm Audio Jack Splitter Cable: to connect headphones to the breadboard without destroying the original headphone plug.
  • DIY Materials: Aluminum foil and cardboard for the capacitive keyboard.
  • Miscellaneous: Breadboards (minimum 2 tied together), Dupont jumper wires (M-M, M-F)
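As a rough check on the RC filter parts listed above, the cutoff frequency of a first-order low-pass filter is f_c = 1 / (2πRC). The sketch below assumes that the 0.1µF capacitor is the filter capacitor (the 4.7µF part is listed for supply stabilization); the function name rc_cutoff_hz is hypothetical.

```c
#include <assert.h>

/* Cutoff frequency of a first-order RC low-pass: f_c = 1 / (2 * pi * R * C). */
double rc_cutoff_hz(double r_ohm, double c_farad)
{
    const double pi = 3.14159265358979323846;
    assert(r_ohm > 0.0 && c_farad > 0.0);
    return 1.0 / (2.0 * pi * r_ohm * c_farad);
}
```

Under that assumption, rc_cutoff_hz(220.0, 0.1e-6) gives roughly 7.2 kHz and rc_cutoff_hz(330.0, 0.1e-6) roughly 4.8 kHz, so the larger resistor smooths the PDM stream more but also removes more of the upper audio band.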

Block Schema

Electronic schema

Pictures

Pinout & Hardware Mapping

To ensure stability and prevent conflicts with the ESP32-S3's internal memory and boot processes, several pins were explicitly avoided:

  • Reserved for SPI Flash / Octal PSRAM: GPIO 26-37 (GPIO 26-32 carry the SPI flash/PSRAM bus; GPIO 33-37 are added in Octal mode).
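  • Reserved for Native USB: GPIO 19-20 (D-/D+). The final mapping nevertheless assigns these two pins to the CH1 looper buttons, so the native USB port cannot be used while the instrument is running.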
  • Boot/Strapping Pins Avoided: GPIO 0, 3, 45, 46.

Pin Constant / Name | GPIO Pin | Component / Module | Notes & Justification
PIN_I2C_SDA | 15 | I2C Bus Data (OLED + VL53L1X) | SAFE: not used by Octal PSRAM. Shares the bus for display and sensor.
PIN_I2C_SCL | 16 | I2C Bus Clock (OLED + VL53L1X) | SAFE: not used by Octal PSRAM.
PIN_ENC_A | 17 | EC11 Rotary Encoder | Standard digital input.
PIN_ENC_B | 18 | EC11 Rotary Encoder | Standard digital input.
PIN_ENC_PUSH | 21 | EC11 Rotary Encoder | Push button for menu selection.
PIN_ENC_KEY | 47 | EC11 Rotary Encoder | Additional encoder key/switch input.
KEY_PINS | 1, 2, 4, 5, 6, 7, 8, 9, 10, 11 | Touch Keyboard (10 Keys) | Uses native S3 touch-capable GPIOs. Consciously avoids strapping pins.
PIN_BPM_POT | 12 | Analog Potentiometer (BPM) | ADC channel: safe and stable for adc_oneshot_read().
PIN_VOL_POT | 13 | Analog Potentiometer (Volume) | ADC channel: safe and stable for adc_oneshot_read().
PIN_MASTER_VOL | 14 | Analog Potentiometer (Master) | ADC channel: safe and stable for adc_oneshot_read().
PIN_AUDIO_PDM | 40 | Audio Output (PDM via I2S) | SAFE: placed outside of the reserved PSRAM range.
PIN_BUZZER | 41 | Metronome Buzzer | Independent passive buzzer driven via LEDC PWM.
PIN_CH2_PLAY | 42 | Looper Control (CH2 Play) | Re-mapped from 48 (which was colliding with the onboard RGB LED). Safe MTMS pin after GPIO reset.
PIN_CH2_REC | 38 | Looper Control (CH2 Record) | Tactile button (active-LOW with internal pull-up).
PIN_CH1_PLAY | 20 | Looper Control (CH1 Play) | Tactile button (active-LOW with internal pull-up).
PIN_CH1_REC | 19 | Looper Control (CH1 Record) | Tactile button (active-LOW with internal pull-up).

Software Design

The firmware is written in C on ESP-IDF (v6.x). The architecture relies on FreeRTOS tasks to separate the computationally heavy real-time audio synthesis from user interaction and sensor polling.

Firmware Overview:

  • Development Environment: ESP-IDF v6.x with the Xtensa toolchain, using CMake.
  • Third-Party Libraries: Primarily relies on native ESP-IDF drivers (I2C, LEDC, ADC, Touch Sensor, I2S).
  • Key Algorithms & Implementations:
    1. Dual-Channel Looper Engine: A dual-channel state machine that replaced the original basic overdubbing. It records timestamped NOTE_ON, NOTE_OFF, and VOL_CHANGE events relative to the metronome.
    2. Dynamic BPM Scaling: Looper playback speed follows the metronome's real-time BPM. The user can speed up or slow down a recorded loop on the fly without shifting its pitch.
    3. 15-Voice Polyphonic Synthesizer: Calculates sine, square, triangle, and sawtooth waveforms on the fly. Generates 16-bit PCM audio that is pushed to the I2S peripheral in PDM TX mode.
    4. Thread-Safe I2C: A custom shared_i2c.c wrapper uses FreeRTOS mutexes to allow both the OLED display (UI Task) and the VL53L1X sensor (Theremin Task) to safely communicate on the exact same physical I2C pins without corrupting data.
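To illustrate item 3 above, here is a minimal sketch of on-the-fly waveform generation with a 32-bit phase accumulator. It is not the project's synth.c: the names osc_phase_inc and osc_sample are hypothetical, the sample rate is left as a parameter, and the sine waveform is omitted to keep the sketch free of math-library dependencies (it would typically come from sinf or a lookup table).

```c
#include <assert.h>
#include <stdint.h>

typedef enum { WAVE_SQUARE, WAVE_TRIANGLE, WAVE_SAWTOOTH } waveform_t;

/* Phase increment per sample: the 32-bit phase wraps once per period,
 * so inc = freq / sample_rate * 2^32. */
uint32_t osc_phase_inc(uint32_t freq_hz, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0 && freq_hz < sample_rate_hz);
    return (uint32_t)(((uint64_t)freq_hz << 32) / sample_rate_hz);
}

/* Convert the current phase into a signed 16-bit PCM sample. */
int16_t osc_sample(waveform_t wave, uint32_t phase)
{
    /* Top 16 bits of the phase as a rising ramp: -32768..32767. */
    int32_t ramp = (int32_t)(phase >> 16) - 32768;

    switch (wave) {
    case WAVE_SQUARE:
        /* High for the first half of the period, low for the second. */
        return (phase & 0x80000000u) ? -32767 : 32767;
    case WAVE_SAWTOOTH:
        return (int16_t)ramp;
    case WAVE_TRIANGLE: {
        /* Fold the ramp around its midpoint, then rescale to full range. */
        int32_t folded = ramp < 0 ? -ramp : ramp;   /* 0..32768 */
        int32_t value = 2 * folded - 32768;         /* -32768..32768 */
        return (int16_t)(value > 32767 ? 32767 : value);
    }
    }
    return 0;
}
```

Each voice would keep its own phase variable and add its increment once per output sample; unsigned overflow provides the wrap-around at the end of every period for free.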

Firmware Architecture

The project uses the ESP32-S3's dual-core processor to keep audio generation free of stutter:

  • Core 0: Exclusively dedicated to the synth_task. This task runs at a very high priority and keeps the I2S audio buffer filled, so sound generation is never starved by UI or sensor work.
  • Core 1: Handles all lower-priority, asynchronous systems:
    • metronome_task: Polls the potentiometers and drives the passive buzzer precisely on beat.
    • looper_task: Manages state machines (LOOPER_RECORDING, LOOPER_PLAYING) and feeds synthetic MIDI-like events back into the synthesizer.
    • theremin_task: Polls the I2C Time-of-Flight sensor.
    • ui_task: Draws menus and dynamic overlays to the SSD1306 OLED based on rotary encoder inputs.

Hardware & Interrupt Safety

Because audio instruments require instantaneous feedback, all buttons and keys use hardware interrupts.

  • The Capacitive Keyboard triggers the touch_pad_isr.
  • The four Looper tactile buttons trigger a gpio_isr_handler.

To prevent kernel panics under heavy system load, the Interrupt Service Routines (ISRs) never call standard logging (ESP_LOGI) or blocking functions. They only update minimal state variables and signal events; any debug output from an ISR goes through esp_rom_printf, which is safe to call from interrupt context.
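The "update minimal state, do the work elsewhere" pattern can be sketched in portable C11 as below. This is an illustration only: it uses stdatomic.h in place of the FreeRTOS primitives the firmware would normally use (for example xQueueSendFromISR), and the names button_isr_body and buttons_take_pending are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* One pending-press bit per button; written by the ISR, read by a task. */
static atomic_uint_fast32_t s_pending_buttons;

/* Body of a hypothetical button ISR: no logging, no blocking, just set a bit. */
void button_isr_body(unsigned button_index)
{
    assert(button_index < 32);
    atomic_fetch_or(&s_pending_buttons, (uint_fast32_t)1u << button_index);
}

/* Called from a normal task: atomically fetch and clear all pending presses. */
uint32_t buttons_take_pending(void)
{
    return (uint32_t)atomic_exchange(&s_pending_buttons, 0);
}
```

The task that calls buttons_take_pending is then free to log, take mutexes, or update the looper state machine, none of which would be safe inside the ISR itself.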

Notable Algorithms

Dynamic BPM Playhead Scaling (looper.c):

uint32_t cur_bpm = metronome_get_bpm();
int64_t scaled_delta = (delta_us * cur_bpm) / s_channels[ch].recording_bpm;
s_channels[ch].playhead_us += scaled_delta;

Instead of recording raw audio (which takes massive memory and locks the tempo), the Looper records “events”. During playback, the time delta is multiplied by the ratio of the current BPM to the recorded BPM. This allows time-stretching loops in real-time.
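A self-contained version of the same idea, for experimentation outside the firmware, might look like the sketch below. The struct layout, the length_us field, the wrap-around at the loop end, and the name loop_advance are assumptions made for illustration; only the scaling formula comes from the excerpt above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t recording_bpm;  /* tempo at which the loop was recorded */
    int64_t  playhead_us;    /* position inside the loop, in recorded time */
    int64_t  length_us;      /* loop length, in recorded time (assumed field) */
} loop_channel_t;

/* Advance the playhead by delta_us of real time, scaled by cur_bpm / recording_bpm. */
void loop_advance(loop_channel_t *ch, int64_t delta_us, uint32_t cur_bpm)
{
    assert(ch->recording_bpm > 0 && ch->length_us > 0);
    /* Multiply before dividing so the integer ratio does not lose precision. */
    ch->playhead_us += (delta_us * (int64_t)cur_bpm) / (int64_t)ch->recording_bpm;
    ch->playhead_us %= ch->length_us;  /* wrap around at the end of the loop */
}
```

For a loop recorded at 120 BPM, 10 ms of real time moves the playhead by 10 ms at 120 BPM, by 5 ms at 60 BPM, and by 20 ms at 240 BPM. Events fire when the playhead passes their timestamps, so the notes keep their pitch and only their spacing changes.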

Hardware Noise Gating (metronome.c):

/* Hardware deadzones to guarantee true 0% and 100% and prevent ADC boundary jitter */
if (s_master_vol <= 4) s_master_vol = 0;
if (s_current_vol <= 4) s_current_vol = 0;

Analog potentiometer readings are inherently noisy. With the knob turned almost all the way down, this “ADC boundary jitter” made the metronome volume fluctuate randomly between 0% and 5%, producing phantom beeps. The deadzone, applied in software to the converted reading, snaps any value of 4 or less to zero and so suppresses the noise floor.
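The conversion from a raw ADC reading to a percentage with deadzones at both ends can be sketched as follows. The excerpt above only shows the lower threshold; the symmetric upper threshold, the 12-bit full-scale value, and the name pot_raw_to_percent are assumptions.

```c
#include <stdint.h>

#define ADC_MAX_RAW  4095  /* assumed 12-bit ADC full scale */
#define DEADZONE_PCT 4     /* lower threshold taken from the excerpt above */

/* Map a raw ADC reading to 0..100 %, snapping both ends to exact limits. */
uint8_t pot_raw_to_percent(int raw)
{
    if (raw < 0) raw = 0;
    if (raw > ADC_MAX_RAW) raw = ADC_MAX_RAW;

    int pct = raw * 100 / ADC_MAX_RAW;

    if (pct <= DEADZONE_PCT) return 0;            /* true 0 % near the bottom */
    if (pct >= 100 - DEADZONE_PCT) return 100;    /* true 100 % near the top (assumed) */
    return (uint8_t)pct;
}
```

The cost of the deadzone is a small jump between 0% and 5% when the knob leaves the bottom of its travel, which is usually inaudible compared with random beeps.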

Collision-Free Polyphony (synth.c / looper.c): To ensure the live instruments and the two looper channels never steal each other's voices, the synthesizer allocates a fixed pool of 15 voices, split into three ranges of five:

  • Voices 0-4: Live Instruments (Theremin, Touch Keyboard).
  • Voices 5-9: Looper Channel 1 playback.
  • Voices 10-14: Looper Channel 2 playback.

Because each source can only claim voices from its own range, looper playback never cuts off a live note (and vice versa), even during dense, multi-layered jamming.
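A partitioned allocator for the three voice ranges listed above can be sketched like this. The enum, the s_voice_active array, and the functions voice_alloc and voice_free are hypothetical names; the real synth.c may track voices differently.

```c
#include <stdbool.h>

#define NUM_VOICES        15
#define VOICES_PER_SOURCE 5

/* Each source owns one contiguous block of five voices. */
typedef enum { SRC_LIVE = 0, SRC_LOOP_CH1 = 1, SRC_LOOP_CH2 = 2 } voice_source_t;

static bool s_voice_active[NUM_VOICES];

/* Claim the first free voice in the source's own range; -1 if all five are busy. */
int voice_alloc(voice_source_t src)
{
    int first = (int)src * VOICES_PER_SOURCE;
    for (int v = first; v < first + VOICES_PER_SOURCE; v++) {
        if (!s_voice_active[v]) {
            s_voice_active[v] = true;
            return v;
        }
    }
    return -1;
}

/* Release a voice so that its owner can reuse it. */
void voice_free(int v)
{
    if (v >= 0 && v < NUM_VOICES) {
        s_voice_active[v] = false;
    }
}
```

When a source runs out of voices, this sketch simply refuses the new note instead of borrowing from a neighbouring range, which is what keeps the three sources isolated.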

Results Obtained

The final result is a fully functional, standalone digital instrument that requires no PC to operate. The hardware RC filter converts the high-frequency PDM digital signal into a smooth analog waveform that drives standard headphones. The capacitive keyboard and optical Theremin work as playable musical inputs, and the Dual-Channel Looper stays in sync with the hardware metronome.

Validation

Validation was performed iteratively:

  • Sensors: The VL53L1X and OLED initially crashed when running concurrently; this confirmed the need for the FreeRTOS mutex implemented in the I2C sharing wrapper.
  • Audio: Early oscilloscope tests showed severe clipping when multiple voices played simultaneously. This was fixed by applying a volume-scaling headroom multiplier (mixed_sample *= 0.33f) and hard clipping limits in the synth engine.
  • Hardware Logic: The Channel 2 Play button originally failed to register. Multimeter measurements showed that GPIO 48 on the ESP32-S3 DevKitC-1 is hardwired to an onboard WS2812 RGB LED, which loaded the line and defeated the internal pull-up resistor. The button was moved to GPIO 42 (an unused JTAG pin).
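The audio fix (headroom multiplier followed by hard clipping) can be sketched as below. The 0.33f factor comes from the text above; the assumption that each voice produces a float in the range -1.0 to 1.0, and the name mix_voices, are illustrative only.

```c
#include <stdint.h>

#define HEADROOM 0.33f  /* scaling factor quoted in the validation notes */

/* Sum the voices, apply headroom, hard-clip, and convert to 16-bit PCM. */
int16_t mix_voices(const float *voices, int count)
{
    float mixed_sample = 0.0f;
    for (int i = 0; i < count; i++) {
        mixed_sample += voices[i];
    }

    mixed_sample *= HEADROOM;

    /* Hard limits: anything still out of range is clamped, not wrapped. */
    if (mixed_sample > 1.0f) mixed_sample = 1.0f;
    if (mixed_sample < -1.0f) mixed_sample = -1.0f;

    return (int16_t)(mixed_sample * 32767.0f);
}
```

With this factor, three full-scale voices stay just below the limit, while a fourth and fifth are caught by the clamp; without the clamp, the conversion to int16_t would overflow and produce much harsher artifacts than clipping.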

Conclusions

The project successfully proves that advanced DSP (Digital Signal Processing) and multi-layered audio sequencing can be achieved entirely on a low-cost microcontroller. The most difficult challenge was managing the ESP32's memory constraints and routing hardware interrupts without crashing the RTOS scheduler.

If I were to build a v2.0, I would add an external I2S DAC (such as a PCM5102A) for true high-fidelity 24-bit audio output, and I would design a custom PCB to eliminate the electrical noise picked up by breadboard jumper wires.

Download

The complete project source code is available on GitLab.

Journal

  • 4 May 2026 — Ordered the ESP32-S3, VL53L1X, OLED, and passive components.
  • 10 May 2026 — Assembled the physical breadboard chassis and wired the basic OLED and EC11 Rotary Encoder. Discovered that my VL53L1X was actually a VL53L0X.
  • 19 May 2026 — Set up the ESP-IDF environment. Successfully implemented the hardware I2C sharing mutex to stop the OLED and ToF sensor from crashing each other.
  • 20 May 2026 — Built the core Software Synthesizer (Oscillators + ADSR) and constructed the DIY capacitive touch keyboard out of aluminum foil.
  • 21 May 2026 — Implemented the PDM audio output via the hardware RC filter and the LM386 amplifier. We had our first actual sound!
  • May 2026 — Added the dual-core Metronome, separating its logic to Core 1 so the audio on Core 0 wouldn't stutter.
  • 22 May 2026 — Built the first iteration of the Looper. Encountered hardware crashes caused by using JTAG pins (GPIO 39) for buttons. Re-wired and stabilized.
  • 23 May 2026 — Completely scrapped the basic overdubbing and rewrote the engine into a Dual-Channel Looper. Expanded polyphony to 15 voices to prevent voice stealing, and patched ADC boundary jitter on the metronome.
  • 24 May 2026 — Final testing, writing the documentation, and project completion.

Bibliography / Resources

  • Official ESP-IDF Documentation (FreeRTOS, I2C, I2S, LEDC, Touch Sensor)
  • ESP32-S3 Technical Reference Manual
  • VL53L1X / VL53L0X Time-of-Flight Datasheet
  • SSD1306 OLED Display Controller Datasheet
  • LM386 Audio Amplifier Datasheet
  • DSP and Audio Synthesis concepts (ADSR envelopes, Wavetable synthesis, PDM to Analog conversion): “How ADSR & Envelopes Work”
pm/prj2026/jan.vaduva/alexandru.albu2801.txt · Last modified: 2026/06/02 16:54 by alexandru.albu2801
CC Attribution-Share Alike 3.0 Unported