Pet recognition and monitoring system

Author: Loghinescu Nicu Adrian
Email: nicu.loghinescu@stud.fim.upb.ro
Master: ACES

Overview

Objective

The purpose of this project is to monitor specific behaviors of your house pets. ESP32-CAM boards with cameras and motion detectors will be placed near areas of interest, such as the pet feeder, and will send a picture to a Firebase server whenever the motion sensor is triggered. The server will use an ML model to recognize your pets, and each detection will be saved to a database, which can then be used to track the pets' behavior.

Hardware

AI Thinker ESP32-CAM - based on the ESP-32S module, which integrates Wi-Fi and Bluetooth
OV2640 Camera - supports the JPEG image output format
PIR HC-SR501 Sensor - motion sensor based on passive infrared (PIR) technology
FT232RL USB to Serial Converter - for programming the ESP32-CAM

Most of the difficulty on the hardware side comes from programming the ESP32-CAM, which does not have a USB port and therefore needs the FT232RL converter. The PIR sensor only has to be connected to a power source, with its output wired to GPIO13 of the ESP32.

Software and services

Firebase
- Cloud Functions - Used as an HTTP server
- Realtime Database - For saving timestamps of pet detections
- Firebase Hosting - Simple web interface for monitoring pet behavior and trying different ML models
Google Teachable Machine - Classifier generator from Google that outputs TensorFlow models
TensorFlow.js - TensorFlow library for JavaScript, which will be used in the Cloud Functions
Arduino IDE, alongside the required libraries for the ESP32, HTTP client, WiFi, Firebase client and camera

Architecture

In short, the architecture has a classic client-server structure, where the ESP32-CAM is the HTTP client and Firebase is the server. The other software and hardware components are built on top of these two.

Application setup

After programming the ESP32 and deploying the Firebase code, the user will need to upload a TensorFlow.js model in order to detect their house pets.
Google's Teachable Machine provides an easy way for an average user to generate classifiers. All the end user has to do is provide pictures and select the desired category for each of them.
Google automatically stores the model and provides a link to it. After the model has been generated, the user should insert the link in the web interface that we provide through Firebase Hosting.
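The link handling can be illustrated with a small sketch. It assumes the link points at a Teachable Machine model folder, where the exported snippet expects to find model.json and metadata.json; the helper name resolveModelFiles and the https check are our own illustrative choices, not part of the project code.

```typescript
// Hypothetical helper: turns the link pasted by the user into the two file
// URLs that a Teachable Machine image model is loaded from.
interface ModelFiles {
  modelUrl: string;    // model topology and weight manifest
  metadataUrl: string; // class labels exported by Teachable Machine
}

function resolveModelFiles(link: string): ModelFiles {
  const trimmed = link.trim();
  // Assumption: only https links are accepted.
  if (!trimmed.startsWith("https://")) {
    throw new Error("Model link must be an https:// URL");
  }
  // Make sure the base ends with exactly one slash before appending file names.
  const base = trimmed.endsWith("/") ? trimmed : trimmed + "/";
  return {
    modelUrl: base + "model.json",
    metadataUrl: base + "metadata.json",
  };
}
```

Rejecting a malformed link at this point means the error can be reported in the web interface right away, rather than later when the server tries to load the model.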

How it works

From now on, we will refer to the ESP32-CAM as the HTTP client and to the Cloud Functions as the server.
The client continuously polls the input pin connected to the motion detector. Once motion is detected, a picture is taken and sent to the server.
The server loads the model if it is not already loaded, makes a prediction on the received image and sends the result back to the client.
If the HTTP exchange was successful, the client sends the prediction result, alongside a timestamp in UNIX format, to the Realtime Database.
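As an illustration of the prediction step, the sketch below picks the most probable class from a list of class name and probability pairs, which is the shape the Teachable Machine image library reports. That the server replies with only this top class is an assumption on our side, and topPrediction is a hypothetical name.

```typescript
// One entry per class known to the model.
interface ClassScore {
  className: string;
  probability: number; // between 0 and 1
}

// Returns the most probable class, or null for an empty prediction list.
function topPrediction(scores: ClassScore[]): ClassScore | null {
  let best: ClassScore | null = null;
  for (const score of scores) {
    if (best === null || score.probability > best.probability) {
      best = score;
    }
  }
  return best;
}
```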

How the Web interface works

The web interface is served entirely through Firebase Hosting.
The main page presents a textbox for providing a link to a new model and a playground area where you can drag and drop pictures in order to test the model.
It also provides an interface for visualizing the data stored in the Realtime Database by the ESP32.

Software architecture - ESP32

As previously described, the actions taken by the ESP32 are quite simple and are described in the following flowchart. The PIR sensor does most of the detection work, and we simply read GPIO13 to see whether motion was detected. If motion is detected, we take a picture and send it to the server. If the exchange is successful and we receive the HTTP OK code, we send the prediction result alongside a timestamp to the Realtime Database.
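The same control flow can be written down as a short sketch. The real firmware is an Arduino sketch running on the ESP32-CAM; TypeScript is used here only to show the order of the steps, and every name below (ClientDeps, pollOnce and the callbacks) is hypothetical. The hardware and network operations are passed in as callbacks so the logic can be followed and tested on its own.

```typescript
// Reply of the server to an uploaded picture.
interface PredictionResponse {
  status: number; // HTTP status code
  label: string;  // predicted class, meaningful only when status is 200
}

// Hypothetical stand-ins for the hardware and network operations.
interface ClientDeps {
  motionDetected(): boolean;                       // reads the PIR output on GPIO13
  takePicture(): Uint8Array;                       // JPEG bytes from the camera
  postImage(jpeg: Uint8Array): PredictionResponse; // request to /predictImage
  saveDetection(label: string, unixSeconds: number): void; // Realtime Database write
  nowUnixSeconds(): number;                        // current UNIX timestamp
}

// One pass of the polling loop. Returns true if a detection was stored.
function pollOnce(deps: ClientDeps): boolean {
  if (!deps.motionDetected()) {
    return false; // nothing moved, so no picture is taken
  }
  const response = deps.postImage(deps.takePicture());
  if (response.status !== 200) {
    return false; // the exchange failed, so nothing is stored
  }
  deps.saveDetection(response.label, deps.nowUnixSeconds());
  return true;
}
```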

Software architecture - Firebase

The server provides only two functions, both of which are quite simple.

Cloud functions: /predictImage
- This function will wait for a request containing an image and respond with the predicted content of that image.
Cloud functions: /changeModel
- This function will wait for a request containing a link to a new model, and will save that link to a file, from which it will be read by the /predictImage function when loading the model. It will also invalidate the currently loaded model, so that it will be reloaded.
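The interplay between the two functions comes down to a lazily loaded model that can be invalidated. The sketch below shows one possible way to structure it. ModelCache, readLink and load are hypothetical names: in the real server, readLink would read the saved link from the file and load would call the TensorFlow.js loading routine.

```typescript
// M stands for whatever model object the loader returns.
class ModelCache<M> {
  private pending: Promise<M> | null = null;
  private readonly readLink: () => string;
  private readonly load: (link: string) => Promise<M>;

  constructor(readLink: () => string, load: (link: string) => Promise<M>) {
    this.readLink = readLink;
    this.load = load;
  }

  // Used by /predictImage: loads on first use, then reuses the same promise,
  // so concurrent requests do not trigger several loads.
  get(): Promise<M> {
    if (this.pending === null) {
      this.pending = this.load(this.readLink());
    }
    return this.pending;
  }

  // Used by /changeModel after the new link has been saved: the next call
  // to get() reads the link again and reloads the model.
  invalidate(): void {
    this.pending = null;
  }
}
```

Caching the promise rather than the loaded model keeps the class short. A more complete version would probably also clear the cache when loading fails, so that a bad link does not stay cached until the next /changeModel call.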