
Image Recognition

Author: Alexandru Vrabii
Email: alexandru.vrabii@stud.acs.upb.ro
Master: AAC
Academic year: 2023-2024

Introduction

Context

The idea for this project came from the desire to take my bachelor's project in a different direction. I wanted to build an image recognition system based on the ESP32, a microcontroller that is not designed for image processing and analysis. Such a system would reduce the cost of developing an image recognition solution: the ESP32-CAM board is much cheaper than a dedicated image processing board such as the NVIDIA Jetson Nano.

Objectives

The ideal scenario is therefore to collect images with the ESP32-CAM and use them to train a machine learning model that recognizes three types of objects. The ESP32-CAM then recognizes the object in front of it and sends the result to a Firebase database. An ESP8266 retrieves this information from Firebase and, depending on the recognized object, controls an SG90 servo motor to simulate sorting the objects.

During development I encountered several difficulties, and I now understand why the ESP32-CAM is not the best choice for an image recognition project. However, I managed to solve some of them and can show an MVP of the initial idea.

TODO :

Architecture

Hardware

Components

AI Thinker ESP32-CAM - based on the ESP-32S module, which integrates Wi-Fi and Bluetooth
ESP8266 - low-cost Wi-Fi microchip with built-in TCP/IP networking software and microcontroller capability
OV2640 Camera - supports JPEG image output
SG90 - tiny, lightweight servo motor with high output power

Circuit Diagram

TBD

Real-life view

TODO : insert a photo here

Software

Software architecture

The software architecture has three parts:

Data acquisition and ML model training
Image Recognition and Firebase Population
Data acquisition from Firebase and decision making

Used methods

Image Capturing

For the video stream, I used the Eloquent library for the ESP32-CAM. The streaming code is one of the examples provided by the Eloquent team, a sketch called 4_Video_Feed. I made minor modifications to it so that it connects to my personal Wi-Fi network. This code is uploaded directly to the ESP32-CAM board.

For acquiring, storing, and processing the images and for training the ML model, I used the Python library everywhereml. To capture pictures of the objects to recognize, I first connected to the IP address of the ESP32-CAM that streams the video, then started the image collection process. I collected about 4000 pictures across 3 classes: 'background', the empty background without any recognizable object; 'alenka', a type of candy; and 'menthol', another type of candy. Once the images were collected and grouped into folders named after their classes, they could be processed. To reduce the training time of the ML model, the images were converted to grayscale and their resolution was reduced to 40×30 pixels.
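
To illustrate the effect of this preprocessing, here is a minimal pure-Python sketch of grayscale conversion and downscaling. It is not the everywhereml implementation: the function names to_gray and downscale, the luminance weights, and the block-averaging strategy are assumptions chosen for illustration. It starts from a synthetic QVGA (320×240) frame, the size configured in the streaming sketch, and reduces it to 40×30.

```python
# Illustrative sketch only; function names are hypothetical and this is
# not how everywhereml implements its transforms.

def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]


def downscale(gray, out_width, out_height):
    """Shrink a grayscale image by averaging the source block behind each output pixel."""
    in_height, in_width = len(gray), len(gray[0])
    out = []
    for oy in range(out_height):
        y0 = oy * in_height // out_height
        y1 = max(y0 + 1, (oy + 1) * in_height // out_height)
        row = []
        for ox in range(out_width):
            x0 = ox * in_width // out_width
            x1 = max(x0 + 1, (ox + 1) * in_width // out_width)
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


# Synthetic 320x240 frame: left half black, right half white.
frame = [
    [(0, 0, 0) if x < 160 else (255, 255, 255) for x in range(320)]
    for _ in range(240)
]
small = downscale(to_gray(frame), 40, 30)
print(len(small[0]), len(small))  # width and height of the reduced image
```

Going from 320×240 to 40×30 cuts the pixel count per image from 76,800 to 1,200, which reduces the amount of data the feature extraction and training steps have to process.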

The transformed images are then used to train the ML model, in this case a RandomForest classifier.

Note: Transforming the images and training the classifier takes a considerable amount of time.

After training the classifier, we convert the trained model into C++ libraries that can be used in the Arduino IDE and then upload them to the ESP32CAM.

With the trained classifier available as Arduino-compatible libraries, the ESP32-CAM can be programmed for the actual image recognition. I created a sketch that connects to the personal Wi-Fi network and to a Firebase database. After the ESP32-CAM camera is initialized, image recognition runs in the loop() function. The name of the recognized object is written to the Firebase database, where it can be used for different purposes.

One scenario I implemented is reading the recognized object from Firebase with the ESP8266 board. It acts as a decision node: depending on the value received from Firebase, it controls a servo motor that could be used to sort the recognized objects.
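
The decision node can be summarized as a lookup from the recognized label to a servo angle. The sketch below is a Python illustration of that logic, not the code running on the ESP8266. The names SERVO_ANGLES, DEFAULT_ANGLE, and angle_for_label are hypothetical; the angles (90 for 'alenka', 0 otherwise) mirror the receiver sketch listed under Code & Structure.

```python
# Hypothetical decision node: map a recognized label to a servo angle in degrees.
SERVO_ANGLES = {"alenka": 90}  # 'alenka' rotates the servo to 90 degrees
DEFAULT_ANGLE = 0              # any other label leaves the servo at 0 degrees


def angle_for_label(label):
    """Return the servo angle for a label read from the database."""
    return SERVO_ANGLES.get(label, DEFAULT_ANGLE)


for label in ("alenka", "menthol", "background"):
    print(label, angle_for_label(label))
```

Keeping the mapping in a table means that sorting 'menthol' into its own position would only require one more entry, for example a different angle for that label.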

Code & Structure

Arduino

4_Video_Feed.ino - streams the video feed.

#include "esp32cam.h"
#include "esp32cam/http/LiveFeed.h"


#define WIFI_SSID ""
#define WIFI_PASS ""
Eloquent::Esp32cam::Cam cam;
Eloquent::Esp32cam::Http::LiveFeed feed(cam, 80);


void setup() {
    Serial.begin(115200);
    delay(3000);
    Serial.println("Init");
    cam.aithinker();
    cam.highQuality();
    cam.qvga();

    while (!cam.begin())
        Serial.println(cam.getErrorMessage());

    // Connect to WiFi
    // If something goes wrong, print the error message
    while (!cam.connect(WIFI_SSID, WIFI_PASS))
        Serial.println(cam.getErrorMessage());

    //Initialize live feed http server
    // If something goes wrong, print the error message
    while (!feed.begin())
        Serial.println(feed.getErrorMessage());

    // make the camera accessible at http://esp32cam.local
    if (!cam.viewAt("esp32cam"))
        Serial.println("Cannot create alias, use the IP address");
    else
        Serial.println("Live Feed available at http://esp32cam.local");

    // display the IP address of the camera
    Serial.println(feed.getWelcomeMessage());
}


void loop() {
}

ImgRec_esp32.ino - performs image recognition and sends the result to Firebase.

#include "Arduino.h"
#include "eloquent.h"
#include "eloquent/print.h"
#include "eloquent/tinyml/voting/quorum.h"
#include "eloquent/vision/camera/aithinker.h"
#include "HogPipeline.h"
#include "HogClassifier.h"
#include "Firebase_ESP_Client.h"
 
//Provide the token generation process info.
#include "addons/TokenHelper.h"
#include "WiFi.h"
//Define Firebase Data object
FirebaseData fbdo;
 
FirebaseAuth auth;
FirebaseConfig config;

//for setting server
#include "esp32cam.h"
#define WIFI_SSID "DIGI-x9kS"
#define WIFI_PASS "FkPVr3hT"


Eloquent::TinyML::Voting::Quorum<7> quorum;
String header;
String predictionLabel;

unsigned long sendDataPrevMillis = 0;
int count = 0;
bool signupOK = false;

void setup() {
  Serial.begin(115200);
  delay(3000);
  Serial.println("Begin");

  Serial.print("Connecting to ");
  Serial.println(WIFI_SSID);
  // connect to Wi-Fi
  WiFi.begin(WIFI_SSID, WIFI_PASS);
  while(WiFi.status() != WL_CONNECTED){
    delay(500);
    Serial.print("+");
  }
  Serial.println("");
  Serial.println("Connected to WiFi");
  Serial.println("IP address:");
  Serial.println(WiFi.localIP());
/*-------------------------------------- */
   /* Assign the api key (required) */
  config.api_key = "AIzaSyDdfKjkcXJSKMqDhL9y4lJ37kuIoLcBwjI";
 
  /* Assign the RTDB URL (required) */
  config.database_url = "https://imagerecognition-d21b2-default-rtdb.europe-west1.firebasedatabase.app/";

   /* Sign up */
  if (Firebase.signUp(&config, &auth, "", "")){
    Serial.println("ok");
    signupOK = true;
  }
  else{
    Serial.printf("%s\n", config.signer.signupError.message.c_str());
  }
  //start the server
  //server.begin();
  
   /* Assign the callback function for the long running token generation task */
  config.token_status_callback = tokenStatusCallback; //see addons/TokenHelper.h
 
  Firebase.begin(&config, &auth);
  Firebase.reconnectWiFi(true);
  /*------------------------------------- */

  camera.qqvga();
  camera.grayscale();

  while (!camera.begin())
    Serial.println("Cannot init camera"); 
}

void loop() {
 
 

 if (Firebase.ready() && signupOK && (millis() - sendDataPrevMillis > 400 || sendDataPrevMillis == 0)){
    sendDataPrevMillis = millis();

   if (!camera.capture()) {
      Serial.println(camera.getErrorMessage());
      delay(1000);
      return;
  }
  // apply HOG pipeline to camera frame
  hog.transform(camera.buffer);

  // get a stable prediction
  uint8_t prediction = classifier.predict(hog.features);
  int8_t stablePrediction = quorum.vote(prediction);

  if (quorum.isStable()) {
      predictionLabel = classifier.getLabelOf(stablePrediction);
      Serial.println("Stable prediction: " + predictionLabel); 
  }  
  camera.free();

    // Write the predicted label as a string to the database path test/String
    if (Firebase.RTDB.setString(&fbdo, "test/String", predictionLabel)){
      Serial.println("PASSED");
      Serial.println("PATH: " + fbdo.dataPath());
      Serial.println("TYPE: " + fbdo.dataType());
    }
    else {
      Serial.println("FAILED");
      Serial.println("REASON: " + fbdo.errorReason());
    }
 }
}
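
The sketch above passes every per-frame prediction through a Quorum<7> voter and only reports a label once the vote is stable. The exact rule used by the Eloquent library is not reproduced here. The Python sketch below shows one plausible interpretation, assumed for illustration: a strict majority over the last 7 predictions. The class name QuorumVoter and its behavior are hypothetical.

```python
from collections import Counter, deque


class QuorumVoter:
    """Sliding-window majority vote over recent predictions (hypothetical helper)."""

    def __init__(self, window=7):
        self.window = window
        self.history = deque(maxlen=window)  # keeps only the latest `window` votes

    def vote(self, prediction):
        """Record a prediction; return the majority class, or None if unstable."""
        self.history.append(prediction)
        if len(self.history) < self.window:
            return None  # not enough frames seen yet
        label, count = Counter(self.history).most_common(1)[0]
        # Require a strict majority of the window to call the result stable.
        return label if count > self.window // 2 else None


voter = QuorumVoter(window=7)
frames = [0, 0, 1, 0, 0, 2, 0, 1, 1, 1, 1, 1]  # noisy per-frame class indices
results = [voter.vote(p) for p in frames]
print(results)
```

With this rule a single misclassified frame does not change the reported label, at the cost of a delay of a few frames when the object in front of the camera actually changes.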

esp86_receiver.ino - reads data from Firebase and controls the servo motor.

#include "Arduino.h"
#include "Firebase_ESP_Client.h"

#include "Servo.h" // servo library  
Servo s1;  

//Provide the token generation process info.
#include "addons/TokenHelper.h"
 
// Insert your network credentials
#define WIFI_SSID "DIGI-x9kS"
#define WIFI_PASSWORD "FkPVr3hT"
 
// Insert Firebase project API Key
#define API_KEY "AIzaSyDdfKjkcXJSKMqDhL9y4lJ37kuIoLcBwjI"
 
// Insert the RTDB URL
#define DATABASE_URL "https://imagerecognition-d21b2-default-rtdb.europe-west1.firebasedatabase.app/" 
 
//Define Firebase Data object
FirebaseData fbdo;
 
FirebaseAuth auth;
FirebaseConfig config;
 
unsigned long sendDataPrevMillis = 0;
String stringValue;
bool signupOK = false;
 
void setup() {
  Serial.begin(115200);
  WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
  Serial.print("Connecting to Wi-Fi");
  while (WiFi.status() != WL_CONNECTED) {
    Serial.print(".");
    delay(300);
  }
  Serial.println();
  Serial.print("Connected with IP: ");
  Serial.println(WiFi.localIP());
  Serial.println();
 
  /* Assign the api key (required) */
  config.api_key = API_KEY;
 
  /* Assign the RTDB URL (required) */
  config.database_url = DATABASE_URL;
 
  /* Sign up */
  if (Firebase.signUp(&config, &auth, "", "")) {
    Serial.println("ok");
    signupOK = true;
  }
  else {
    Serial.printf("%s\n", config.signer.signupError.message.c_str());
  }
 
  /* Assign the callback function for the long running token generation task */
  config.token_status_callback = tokenStatusCallback; //see addons/TokenHelper.h
 
  Firebase.begin(&config, &auth);
  Firebase.reconnectWiFi(true);
 // Serial.println("Leaving Setup");

  s1.attach(0);  // attach the servo to GPIO0 (pin D3 on the board)
}

void loop() {
    //Serial.println("Enter in Loop");

  if (Firebase.ready() && signupOK && (millis() - sendDataPrevMillis > 400 || sendDataPrevMillis == 0)) {
      //Serial.println("Enter in IF statement of Firebase");
    sendDataPrevMillis = millis();
    if (Firebase.RTDB.getString(&fbdo, "/test/String")) {
      if (fbdo.dataType() == "string") {
        stringValue = fbdo.stringData();
        Serial.println(stringValue);
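        if (stringValue == "alenka") {
          s1.write(90);  // 'alenka' recognized: rotate the servo to 90 degrees
        }
        else {
          s1.write(0);   // any other label: return the servo to 0 degrees
        }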
      }
    }
    else {
      Serial.println(fbdo.errorReason());
    }     
  }
}

Python

Image acquisition

# Collect images from the ESP32-CAM web server
from logging import basicConfig, INFO
from everywhereml.data import ImageDataset
from everywhereml.data.collect import MjpegCollector

base_folder = 'IOT_Captures_web'
IP_ADDRESS_OF_ESP = 'http://192.168.101.35:81'
basicConfig(level=INFO)

try:
  
    image_dataset = ImageDataset.from_nested_folders(
        name='Candies',  
        base_folder=base_folder
    )
except FileNotFoundError:
  
    mjpeg_collector = MjpegCollector(address=IP_ADDRESS_OF_ESP)
    image_dataset = mjpeg_collector.collect_many_classes(
        dataset_name='Candies', 
        base_folder=base_folder,
        duration=30
    )
  
print(image_dataset)

Image transformation

from test import image_dataset
from everywhereml.preprocessing.image.object_detection import HogPipeline
from everywhereml.preprocessing.image.transform import Resize

image_dataset = image_dataset.gray().uint8()

pipeline = HogPipeline(
    transforms=[
        Resize(width=40, height=30)
    ]
)

# Convert images to feature vectors
feature_dataset = pipeline.fit_transform(image_dataset)
feature_dataset.describe()
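
HogPipeline turns each image into a feature vector based on HOG (Histogram of Oriented Gradients). As a rough illustration of the core idea, the sketch below computes a single histogram of gradient orientations, weighted by gradient magnitude, over a small grayscale patch. It is a simplification and not the everywhereml implementation: a full HOG descriptor also splits the image into cells and normalizes over blocks, and the function name orientation_histogram and the choice of 9 bins are assumptions.

```python
import math


def orientation_histogram(gray, bins=9):
    """Histogram of unsigned gradient orientations, weighted by gradient magnitude."""
    height, width = len(gray), len(gray[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # horizontal central difference
            gy = gray[y + 1][x] - gray[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no orientation to record
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned, 0..180
            hist[min(int(angle / (180.0 / bins)), bins - 1)] += magnitude
    total = sum(hist)
    # Normalize so the bins sum to 1 (left as zeros for a completely flat patch).
    return [h / total for h in hist] if total else hist


# An 8x8 patch with a vertical edge: brightness changes only along x.
patch = [[0] * 4 + [255] * 4 for _ in range(8)]
hist = orientation_histogram(patch)
print(hist)
```

Because the brightness of this patch changes only in the horizontal direction, all of the gradient energy falls into the first orientation bin. Features of this kind describe the shape of edges rather than raw pixel values.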

Train the Image Recognition model

from everywhereml.sklearn.ensemble import RandomForestClassifier

from pipeline import feature_dataset

for i in range(10):
    clf = RandomForestClassifier(n_estimators=500, max_depth=5)

    # fit on train split and get accuracy on the test split
    train, test = feature_dataset.split(test_size=0.4, random_state=i)
    clf.fit(train)

    print('Score on test set: %.2f' % clf.score(test))

clf.fit(feature_dataset)
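
The loop above evaluates the classifier on ten different 60/40 train/test splits, one per random_state, before the final fit on the whole dataset. The sketch below illustrates what a seeded split does. It is a hypothetical stand-in written with the standard library, not the everywhereml split method, and the sample list is a placeholder for the roughly 4000 collected images.

```python
import random


def split(samples, test_size=0.4, random_state=0):
    """Shuffle with a fixed seed and return (train, test) lists (hypothetical helper)."""
    rng = random.Random(random_state)  # same seed gives the same split
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(4000))  # placeholder indices standing in for images
train, test = split(samples, test_size=0.4, random_state=0)
print(len(train), len(test))
```

Each seed produces a different partition, so comparing the ten test scores gives an idea of how much the accuracy depends on which images end up in the test set.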

Convert the model to a C++ library

from pipeline import pipeline
from pipeline import feature_dataset

print(pipeline.to_arduino_file(
    filename=r'C:\Users\xiaomi\Desktop\Master an 2\IOT\Proiect_IOT\HogPipeline.h',
    instance_name='hog'
))


from ML_train import clf

print(clf.to_arduino_file(
    filename=r'C:\Users\xiaomi\Desktop\Master an 2\IOT\Proiect_IOT\HogClassifier.h',
    instance_name='classifier', 
    class_map=feature_dataset.class_map
))

Results

Conclusion

References

Hardware components documentation and datasheets

External Libraries

Arduino

Javascript/CSS

Python

Literature

iothings/proiecte/2023/imgrecsystem.1705163004.txt.gz · Last modified: 2024/01/13 18:23 by alexandru.vrabii