The project's main goal is to let the user stream music over Bluetooth to an ESP32, which forwards the received audio to a DAC. The project thus brings Bluetooth audio to speakers that lack Bluetooth support. Thanks to existing libraries, this is fairly easy to implement. To make the project more complex and to use a means of communication other than Bluetooth, I decided to add usage statistics tracking. On this page, I cover the general description, hardware design, software design, implementation highlights and the challenges faced while developing this project. Music streaming is implemented with the A2DP Bluetooth profile, which is widely used in real-world wireless speakers, headphones and car stereos. The AVRCP profile is used alongside it to gather statistics and track information.
This flow chart represents the state flow of the ESP32.
Simultaneous use of Wi-Fi and Bluetooth is reportedly problematic, which is why the two are mutually exclusive in this project. Gathering local statistics means recording the artist, album, title and cover art (if supported) of the currently playing song, as well as the number of plays and the total play time of each recorded song. The play time is the total time the song spent in the 'play' state. For a play to count, the song must accumulate at least 5 seconds of play time in a playback session.
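The play-counting rule can be sketched as follows. This is a minimal host-side illustration, not the project's code: `TrackStats`, `record_session` and `kMinPlaySeconds` are hypothetical names, and it assumes that sessions shorter than 5 seconds still add to the total play time while not counting as a play.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-track totals; the names are not taken from the project.
struct TrackStats {
  unsigned plays = 0;
  double total_seconds = 0.0;
};

// Threshold stated in the text: a play counts after at least 5 s of play time.
constexpr double kMinPlaySeconds = 5.0;

// Adds one playback session to the totals for a track.
// Assumption: play time is always accumulated, and only the play counter
// depends on the 5 s threshold.
void record_session(std::map<std::string, TrackStats>& stats,
                    const std::string& track_key, double seconds_played) {
  assert(seconds_played >= 0.0);
  TrackStats& entry = stats[track_key];
  entry.total_seconds += seconds_played;
  if (seconds_played >= kMinPlaySeconds) {
    entry.plays += 1;
  }
}
```

With this sketch, a 3-second session followed by a 12.5-second session of the same track yields one play and 15.5 seconds of total play time.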
Parts used:
The pushbutton disconnects the currently connected Bluetooth device. If it is held during ESP32 startup, SPIFFS is formatted, which erases all unsent data, known Wi-Fi networks and settings.

The LED shows the current status of the board. A blinking LED means that the board is waiting for user action: blinking blue means waiting for a Bluetooth connection, and blinking yellow means waiting for Wi-Fi configuration (ESP32 in AP mode). Solid blue means that a Bluetooth device is connected. Solid yellow indicates an ongoing connection to Wi-Fi (ESP32 in station mode) or another data transfer over Wi-Fi. The LED may also light up red, which indicates that the ESP32 ran into an error; this is usually not noticeable, because the ESP32 leaves the error state by restarting or by other means.

The PCM5102A was chosen because it delivers PCM stereo sound at sampling frequencies up to 384 kHz and resolutions up to 32 bits. However, the SBC codec caps the maximum sampling rate at 48 kHz. In a subjective evaluation, the sound is good and clear, and the transmission is stable, without stuttering.
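The LED states described above amount to a small lookup from operating state to colour and blink mode. The sketch below is a host-side illustration under assumptions: `BtWait`, `Bt` and `Error` stand for the `BT_WAIT`, `BT` and `ERROR` values used in the callback code, while the two Wi-Fi state names, the `LedPattern` type and the 500 ms blink half-period are hypothetical.

```cpp
#include <cassert>

// BtWait, Bt and Error mirror BT_WAIT, BT and ERROR from the callback code.
// WifiApWait and WifiSta are hypothetical names for the two Wi-Fi states.
enum class OpState { BtWait, Bt, WifiApWait, WifiSta, Error };
enum class LedColor { Blue, Yellow, Red };

struct LedPattern {
  LedColor color;
  bool blinking;
};

// Maps an operating state to the LED behaviour described in the text.
inline LedPattern led_pattern_for(OpState state) {
  switch (state) {
    case OpState::BtWait:     return {LedColor::Blue, true};     // waiting for BT
    case OpState::Bt:         return {LedColor::Blue, false};    // BT connected
    case OpState::WifiApWait: return {LedColor::Yellow, true};   // AP mode, waiting
    case OpState::WifiSta:    return {LedColor::Yellow, false};  // Wi-Fi transfer
    case OpState::Error:      return {LedColor::Red, false};     // error state
  }
  return {LedColor::Red, false};
}

// Decides whether the LED should be lit at time now_ms.
// The half-period is an assumed value, not taken from the project.
inline bool led_is_on(const LedPattern& pattern, unsigned long now_ms,
                      unsigned long half_period_ms = 500) {
  assert(half_period_ms > 0);
  if (!pattern.blinking) return true;
  return (now_ms / half_period_ms) % 2 == 0;
}
```

Keeping the mapping in one function means a new state only needs one extra `case`, and the blink timing stays independent of the colour choice.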
The software part of this project is composed of:
Sketch uses 1783898 bytes (85%) of program storage space. Maximum is 2097152 bytes. Global variables use 62900 bytes (19%) of dynamic memory, leaving 264780 bytes for local variables. Maximum is 327680 bytes.
Partition scheme `No OTA (2 MB APP/2 MB SPIFFS)`
Bluetooth music streaming is achieved with the pschatzmann/ESP32-A2DP and pschatzmann/arduino-audio-tools libraries. Although these libraries provide convenient A2DP sink functionality and AVRCP event callbacks (for example, when a song title is received), they lack support for cover art transmission, which is included in recent versions of ESP-IDF. After numerous attempts, I arrived at the following code (one massive callback), which is the core of this project. It stores the album art together with the other track information. Notably, it calls the ESP-IDF API directly alongside the existing library, adding cover art transmission on top of the library's functionality. Putting the ESP32 into a valid Bluetooth A2DP sink and AVRCP state is handled by the library, while the ESP-IDF AVRCP callback is a custom implementation.

    void avrc_callback(esp_avrc_ct_cb_event_t event, esp_avrc_ct_cb_param_t *param) {
      if (event == ESP_AVRC_CT_CONNECTION_STATE_EVT) {
        if (param->conn_stat.connected) {
          Serial.println("Connected");
          op_state = BT;
          if (no_stats_g == 1) {
            return;  // statistics are disabled, nothing else to set up
          }
          // Reset the cover art state and the play timer for the new connection.
          got_features = false;
          got_cover_art_properties = false;
          cover_art_properties_written_bytes = 0;
          got_cover_art = false;
          timerStop(play_timer);
          timerWrite(play_timer, 0);
          a2dp_sink.pause();
          a2dp_sink.set_volume(50);
          register_notification();
          request_metadata();
        } else {
          Serial.println("Disconnected");
          op_state = BT_WAIT;
          if (no_stats_g == 1) {
            return;
          }
          got_features = false;
          got_cover_art_properties = false;
          cover_art_properties_written_bytes = 0;
          got_cover_art = false;
          // Store the record of the track that was playing before the restart.
          char tmp_buf[512];
          if (strlen(current_artist) > 1 && strlen(current_title) > 1) {
            int ret = snprintf(tmp_buf, sizeof(tmp_buf), "%s\t%s\t%s\t%.2f\n",
                               current_artist, current_album, current_title,
                               timerReadSeconds(play_timer));
            // Append only if the whole record fit into the buffer.
            if (ret > 0 && (size_t)ret < sizeof(tmp_buf)) {
              appendFile(SPIFFS, "/spiffs/stats.txt", tmp_buf);
            }
          }
          timerStop(play_timer);
          timerWrite(play_timer, 0);
          ESP.restart();
        }
      } else if (event == ESP_AVRC_CT_CHANGE_NOTIFY_EVT) {
        // Notifications are one-shot, so register again after each one.
        register_notification();
        if (param->change_ntf.event_id == ESP_AVRC_RN_PLAY_STATUS_CHANGE) {
          bool is_playing =
              param->change_ntf.event_parameter.playback == ESP_AVRC_PLAYBACK_PLAYING;
          if (is_playing) {
            Serial.println("Now playing");
            timerStart(play_timer);
            Serial.printf("Seconds played: %.2f\n", timerReadSeconds(play_timer));
          } else {
            Serial.println("Now paused");
            timerStop(play_timer);
            Serial.printf("Seconds played: %.2f\n", timerReadSeconds(play_timer));
          }
        } else if (param->change_ntf.event_id == ESP_AVRC_RN_TRACK_CHANGE) {
          Serial.println("Track changed");
          request_metadata();
          got_cover_art = false;
          Serial.printf("Cover art timer restarted\n");
          timerStop(cover_art_timer);
          timerWrite(cover_art_timer, 0);
          timerStart(cover_art_timer);
          // Store the record of the previous track before its metadata is replaced.
          char tmp_buf[512];
          if (strlen(current_artist) > 1 && strlen(current_title) > 1) {
            int ret = snprintf(tmp_buf, sizeof(tmp_buf), "%s\t%s\t%s\t%.2f\n",
                               current_artist, current_album, current_title,
                               timerReadSeconds(play_timer));
            if (ret > 0 && (size_t)ret < sizeof(tmp_buf)) {
              appendFile(SPIFFS, "/spiffs/stats.txt", tmp_buf);
            }
          }
          timerWrite(play_timer, 0);
        }
      } else if (event == ESP_AVRC_CT_METADATA_RSP_EVT) {
        // Text attributes are not NUL-terminated, so copy them into a
        // zero-filled buffer first.
        char tmp_buf[param->meta_rsp.attr_length + 1];
        if (param->meta_rsp.attr_id != ESP_AVRC_MD_ATTR_COVER_ART) {
          memset(tmp_buf, 0, sizeof(tmp_buf));
          strncpy(tmp_buf, (char *)param->meta_rsp.attr_text, param->meta_rsp.attr_length);
        }
        switch (param->meta_rsp.attr_id) {
          case ESP_AVRC_MD_ATTR_TITLE:
            // snprintf never reads past the end of tmp_buf and always terminates.
            snprintf(current_title, sizeof(current_title), "%s", tmp_buf);
            Serial.printf("Title: %s\n", current_title);
            break;
          case ESP_AVRC_MD_ATTR_ARTIST:
            snprintf(current_artist, sizeof(current_artist), "%s", tmp_buf);
            Serial.printf("Artist: %s\n", current_artist);
            break;
          case ESP_AVRC_MD_ATTR_ALBUM:
            snprintf(current_album, sizeof(current_album), "%s", tmp_buf);
            Serial.printf("Album: %s\n", current_album);
            break;
          case ESP_AVRC_MD_ATTR_COVER_ART:
            memcpy(image_handle, (uint8_t *)param->meta_rsp.attr_text,
                   ESP_AVRC_CA_IMAGE_HANDLE_LEN * sizeof(uint8_t));
            Serial.printf("Image handle\n");
            break;
        }
      } else if (event == ESP_AVRC_CT_REMOTE_FEATURES_EVT) {
        Serial.printf("Features\n");
        if (!got_features && current_title[0] != '\0') {
          got_features = true;
          request_cover_art();
        }
      } else if (event == ESP_AVRC_CT_COVER_ART_STATE_EVT) {
        if (param->cover_art_state.state != ESP_AVRC_COVER_ART_CONNECTED) {
          cover_art_properties_written_bytes = 0;
          Serial.printf("Cover art disconnected. Reason: %d Got properties: %d\n",
                        param->cover_art_state.reason, got_cover_art_properties);
          got_cover_art_properties = false;
          if (!got_cover_art) {
            Serial.printf("Cover art timer restarted\n");
            timerStop(cover_art_timer);
            timerWrite(cover_art_timer, 0);
            timerStart(cover_art_timer);
          }
          return;
        }
        // Cover art channel is connected: ask for the image properties first.
        esp_err = esp_avrc_ct_cover_art_get_image_properties(image_handle);
        if (esp_err != ESP_OK) {
          Serial.printf("ERROR: esp_avrc_ct_cover_art_get_image_properties() failed: %s\n",
                        esp_err_to_name(esp_err));
          op_state = ERROR;
        }
      } else if (event == ESP_AVRC_CT_COVER_ART_DATA_EVT) {
        static File cover_art_file;
        if (param->cover_art_data.status != ESP_BT_STATUS_SUCCESS) {
          got_cover_art_properties = false;
          cover_art_properties_written_bytes = 0;
          if (cover_art_file) {
            cover_art_file.close();
          }
          Serial.printf("Cover art data failed. Status: %d\n", param->cover_art_data.status);
          return;
        }
        if (!got_cover_art_properties) {
          // Phase 1: collect the image properties (descriptor), possibly in chunks.
          uint8_t *grown = (uint8_t *)realloc(
              image_descriptor,
              cover_art_properties_written_bytes + param->cover_art_data.data_len);
          if (grown == NULL) {
            Serial.printf("ERROR: Out of memory for image descriptor\n");
            op_state = ERROR;
            return;
          }
          image_descriptor = grown;
          memcpy(image_descriptor + cover_art_properties_written_bytes,
                 param->cover_art_data.p_data, param->cover_art_data.data_len);
          cover_art_properties_written_bytes += param->cover_art_data.data_len;
          if (param->cover_art_data.final) {
            got_cover_art_properties = true;
            Serial.printf("Got properties with size: %u\n", cover_art_properties_written_bytes);
            esp_err = esp_avrc_ct_cover_art_get_image(image_handle, image_descriptor,
                                                      cover_art_properties_written_bytes);
            if (esp_err != ESP_OK) {
              Serial.printf("ERROR: esp_avrc_ct_cover_art_get_image() failed: %s\n",
                            esp_err_to_name(esp_err));
              op_state = ERROR;
            }
            char tmp_buf[16];
            snprintf(tmp_buf, sizeof(tmp_buf), "/spiffs/art%u", cover_arts_received);
            cover_art_file = SPIFFS.open(tmp_buf, FILE_WRITE);
            if (!cover_art_file) {
              Serial.printf("%s - failed to open file for appending\n", tmp_buf);
              return;
            }
            Serial.printf("Writing cover art to %s\n", tmp_buf);
            // Prefix the image with "<artist>\t<album>\t<title>\t" so the
            // receiver can match it to a track.
            char metadata[512];
            int ret = 0;
            if (strlen(current_artist) > 1 && strlen(current_title) > 1) {
              ret = snprintf(metadata, sizeof(metadata), "%s\t%s\t%s\t",
                             current_artist, current_album, current_title);
              if (ret < 0 || (size_t)ret >= sizeof(metadata)) {
                Serial.printf("ERROR: Metadata injection out of bounds\n");
                op_state = ERROR;
                return;
              }
            }
            if (cover_art_file) {
              if (cover_art_file.write((uint8_t *)metadata, ret) < (size_t)ret) {
                Serial.printf("ERROR: Cover art write to file failed or incomplete\n");
                op_state = ERROR;
                cover_art_properties_written_bytes = 0;
                got_cover_art_properties = false;
                if (!got_cover_art) {
                  Serial.printf("Cover art timer restarted\n");
                  timerStop(cover_art_timer);
                  timerWrite(cover_art_timer, 0);
                  timerStart(cover_art_timer);
                }
                cover_art_file.close();
                return;
              }
            } else {
              Serial.printf("ERROR: Cover art file not open or open failed\n");
              op_state = ERROR;
              cover_art_properties_written_bytes = 0;
              got_cover_art_properties = false;
              if (!got_cover_art) {
                Serial.printf("Cover art timer restarted\n");
                timerStop(cover_art_timer);
                timerWrite(cover_art_timer, 0);
                timerStart(cover_art_timer);
              }
              return;
            }
          }
        } else {
          // Phase 2: append the image data chunks to the open file.
          if (cover_art_file) {
            if (cover_art_file.write(param->cover_art_data.p_data,
                                     param->cover_art_data.data_len) <
                param->cover_art_data.data_len) {
              Serial.printf("ERROR: Cover art write to file failed or incomplete\n");
              op_state = ERROR;
              cover_art_file.close();
              return;
            }
          } else {
            Serial.printf("ERROR: Cover art file not open or open failed\n");
            op_state = ERROR;
            return;
          }
          if (param->cover_art_data.final) {
            cover_art_file.flush();
            got_cover_art_properties = false;
            cover_art_properties_written_bytes = 0;
            got_cover_art = true;
            if (cover_art_file.size() == 0) {
              // Nothing was written: remove the empty file.
              char tmp_buf[16];
              snprintf(tmp_buf, sizeof(tmp_buf), "/spiffs/art%u", cover_arts_received);
              cover_art_file.close();
              if (!SPIFFS.remove(tmp_buf)) {
                Serial.printf("WARNING: Zero-sized file %s delete failed.\n", tmp_buf);
              }
            } else {
              Serial.printf("Cover Art Data Processing Complete.\n");
              cover_arts_received++;
              set_received_cover_arts_count(cover_arts_received);
              cover_art_file.close();
            }
            esp_err = esp_avrc_ct_cover_art_disconnect();
            if (esp_err != ESP_OK) {
              Serial.printf("ERROR: esp_avrc_ct_cover_art_disconnect() failed: %s\n",
                            esp_err_to_name(esp_err));
              op_state = ERROR;
            }
          }
        }
      } else {
        Serial.printf("Unhandled event type: %d\n", event);
      }
    }

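The callback formats the same tab-separated statistics record in two places (on disconnect and on track change). As a minimal sketch, that step could be factored into one helper with an explicit truncation check. `format_stats_line` is a hypothetical name, not part of the project, and the code below is plain C++ so that it can be tried off-target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Writes "<artist>\t<album>\t<title>\t<seconds>\n" into buf.
// Returns true only if the whole record, including the terminating NUL,
// fit into buf_size bytes; a truncated record is reported as failure.
bool format_stats_line(char* buf, std::size_t buf_size, const char* artist,
                       const char* album, const char* title, double seconds) {
  assert(buf != nullptr && buf_size > 0);
  int ret = std::snprintf(buf, buf_size, "%s\t%s\t%s\t%.2f\n",
                          artist, album, title, seconds);
  // snprintf returns the length it wanted to write, so ret >= buf_size
  // means the output was cut off.
  return ret > 0 && static_cast<std::size_t>(ret) < buf_size;
}
```

A caller would append the buffer to the statistics file only when the function returns true, which keeps partial records out of the file.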
Of course, a lot of other custom code handles the rest of the functionality, which is described in the following sections.
The statistics transmission is triggered whenever there is anything to send. Locally, the information is stored in `/spiffs/stats.txt`, one record per line, in the `<artist>\t[album]\t<title>\t<play time in seconds>\n` format. Cover art is stored in individual `/spiffs/artN` files. Cover art files are usually `.jpg` and, in order to match each cover art to its track, `<artist>\t[album]\t<title>\t` is injected into the file before the actual `.jpg` header and data. The statistics receiver knows this layout and parses and organizes the cover art accordingly. If the ESP32 never connects to a Wi-Fi network with Internet access, or the transmission fails for other reasons, the local data is not removed. Once transmission succeeds, the local data becomes redundant and is deleted from local storage.
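To illustrate how a receiver might separate the injected text from the image, here is a minimal sketch. It assumes the prefix written by the callback shown earlier, that is, three fields each terminated by a tab, followed directly by the image bytes. `CoverArt`, `parse_cover_art` and `looks_like_jpeg` are hypothetical names; the actual receiver is the Python server, whose code is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type: the three text fields plus the raw image bytes.
struct CoverArt {
  std::string artist, album, title;
  std::vector<uint8_t> image;
};

// Splits "<artist>\t<album>\t<title>\t<image bytes>" into its parts.
// Only the first three tabs are treated as separators, because the binary
// image data may itself contain tab bytes. Returns false if fewer than
// three separators are present (for example, a file without the prefix).
bool parse_cover_art(const std::vector<uint8_t>& file, CoverArt& out) {
  std::string* fields[3] = {&out.artist, &out.album, &out.title};
  std::size_t pos = 0;
  for (std::string* field : fields) {
    std::size_t end = pos;
    while (end < file.size() && file[end] != '\t') ++end;
    if (end == file.size()) return false;  // separator missing
    field->assign(file.begin() + pos, file.begin() + end);
    pos = end + 1;  // skip the tab
  }
  assert(pos <= file.size());
  out.image.assign(file.begin() + pos, file.end());
  return true;
}

// JPEG data starts with the SOI marker 0xFF 0xD8.
bool looks_like_jpeg(const std::vector<uint8_t>& image) {
  return image.size() >= 2 && image[0] == 0xFF && image[1] == 0xD8;
}
```

Checking the JPEG start marker after the split is a cheap way to detect files whose prefix was not written or was parsed incorrectly.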
To provide a complete user experience, the ESP32 code must adapt to changes in its environment. The user can register several Wi-Fi networks, and the ESP32 tries to connect to them before sending local statistics. It tries each known Wi-Fi network for 10 seconds. If the ESP32 fails to connect to every known network, or no network has been provided at all, it enters Wi-Fi AP mode. When a smartphone or PC connects to the ESP32 hotspot, a web page pops up (captive portal) that prompts the user either to enter the credentials of a valid Wi-Fi network or to continue without recording statistics. If the user opts out of statistics, the choice is remembered; it can be reset by holding the red pushbutton during ESP32 startup (in practice, press the ESP32's reset button while holding the pushbutton). This information is also shown to the user on the captive portal page.
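The connection strategy above (try each known network for 10 seconds, then fall back to AP mode) can be sketched independently of the Wi-Fi driver. In the sketch below, `WifiCredentials`, `WifiResult` and `connect_to_known_network` are hypothetical names, and `try_connect` is an assumed hook that the firmware would implement with the real Wi-Fi API; none of this is taken from the project's source.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct WifiCredentials {
  std::string ssid;
  std::string password;
};

enum class WifiResult { Connected, FallbackToAp };

// Per-network timeout stated in the text.
constexpr unsigned long kConnectTimeoutMs = 10000;

// Tries every known network in order. try_connect is a hypothetical hook
// that attempts to join one network and returns true if the connection
// was established within timeout_ms.
WifiResult connect_to_known_network(
    const std::vector<WifiCredentials>& known,
    const std::function<bool(const WifiCredentials&, unsigned long)>& try_connect,
    std::string* connected_ssid = nullptr) {
  assert(static_cast<bool>(try_connect));
  for (const WifiCredentials& net : known) {
    if (try_connect(net, kConnectTimeoutMs)) {
      if (connected_ssid != nullptr) *connected_ssid = net.ssid;
      return WifiResult::Connected;
    }
  }
  // No known network, or none reachable: the caller starts AP mode
  // and the captive portal.
  return WifiResult::FallbackToAp;
}
```

Passing the connection attempt in as a function keeps the fallback logic testable on a PC, where the hook can simply simulate which networks are in range.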
A Python server on the Internet at iot.echipa3.xyz:50993 (domain and hosting inherited from another project) listens for the information described in the Bluetooth music streaming section. It stores the information in a local volume managed by Docker, as a directory tree: artist/[album]/title/plays.txt,time.txt,art.jpg.
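The directory layout can be illustrated with a small path-building sketch. The real server is written in Python and its code is not shown here, so this is only an illustration of the tree shape. `track_directory` and `sanitize_component` are hypothetical names, and both the handling of a missing album (the level is skipped) and the replacement of path separators are assumptions.

```cpp
#include <cassert>
#include <string>

// Replaces path separators inside a name so that it stays a single
// directory level. This sanitization rule is an assumption.
std::string sanitize_component(std::string name) {
  for (char& c : name) {
    if (c == '/' || c == '\\') c = '_';
  }
  return name;
}

// Builds "artist/[album]/title"; the album level is skipped when the
// source device did not report an album.
std::string track_directory(const std::string& artist, const std::string& album,
                            const std::string& title) {
  assert(!artist.empty() && !title.empty());
  std::string dir = sanitize_component(artist);
  if (!album.empty()) dir += "/" + sanitize_component(album);
  dir += "/" + sanitize_component(title);
  return dir;
}
```

The files `plays.txt`, `time.txt` and `art.jpg` would then live inside the directory returned for each track.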
A simple frontend served by nginx at https://iot.echipa3.xyz shares the data volume with the Python statistics server and presents the data to the user in a neat form. The information on it is updated dynamically.
`E (13054) task_wdt: esp_task_wdt_reset(705): task not found` is spammed on boot until the number in parentheses reaches ~20000.

I implemented a working solution which forwards Bluetooth sound to a mini-jack output. An adequate user interface is provided: Wi-Fi credential provisioning, a status LED and a pushbutton. The solution may still crash randomly, though it was used without issues while writing this documentation (earphones connected to the jack and music streamed from an iPhone). As the code was mainly tested with an iPhone, there may be issues with other source devices; for example, a Windows 11 laptop playing the web version of YouTube Music reports neither the album nor its cover art. The Python server may also occasionally refuse to receive statistics. All of this means that the code requires polishing on all levels.
[[https://medium.com/@atacanymc/creating-a-captive-portal-with-esp32-a-step-by-step-guide-9e9f78ab87b8|Creating a Captive Portal with ESP32: A Step-by-Step Guide]], heavily modified in my implementation