AI on the Edge: ESP32-CAM & TinyML

Sony Sunny • 7 min read

Edge AI is transforming how devices think: no cloud, no delay, just on-board intelligence.
The ESP32-CAM merges Wi-Fi, camera, and compute power in a few dollars of silicon, while TinyML enables deep-learning inference directly on the microcontroller.


🤖 Why TinyML on ESP32 Matters

TinyML lets microcontrollers run machine-learning models that once needed servers.
With ESP32-CAM, you can deploy models that see, hear, and act instantly, without sending data off-device.

Key Advantages

  • Low Latency: Inference happens locally in milliseconds.
  • Data Privacy: Images or signals never leave the device.
  • Network Independence: Works even with intermittent connectivity.
  • Cost Efficiency: Hardware under USD 10 and minimal cloud usage.
  • Scalability: Thousands of devices can process and act independently.

🏭 Industrial Use Cases

| Domain | Applications |
| --- | --- |
| Manufacturing | Defect detection, assembly verification, tool-wear analysis |
| Smart Buildings | Occupancy detection, lighting/HVAC automation |
| Retail | Shelf stock monitoring, queue length detection |
| Safety | PPE detection, restricted-area alerts |
| Utilities & Energy | Flame/smoke detection, gauge reading |
| Agritech | Pest spotting, crop-health monitoring |

Each use case benefits from running inference at the source, reducing bandwidth and increasing reliability.


⚙️ Edge AI Architecture

Sensor (Camera / ADC)
        ↓
   Pre-Processing
        ↓
  TinyML Model (TFLite Micro)
        ↓
   Local Action (LED / Relay / MQTT)
        ↓
   Optional Cloud Dashboard

This architecture allows instant decision-making while keeping cloud components lightweight.


🧩 Hands-On: Person Detection Demo (ESP32-CAM)

A minimal TinyML person-detection project running locally on ESP32-CAM. This demo uses the TensorFlow Lite Micro person_detection model and toggles the onboard LED when a person is detected.

/*
 * Project: ESP32-CAM TinyML Person Detection
 * Description:
 *   Runs a TensorFlow Lite Micro model for person detection on the AI Thinker ESP32-CAM.
 *   Captures grayscale frames, resizes to 96×96, performs inference locally,
 *   and toggles the onboard flash LED when a person is detected.
 *
 * Hardware: AI Thinker ESP32-CAM (OV2640)
 * Framework: Arduino (PlatformIO)
 * Author: Sony Sunny
 * Date: 2025-10-22
 */
 
 
#include <Arduino.h>          // Core Arduino functions
#include "esp_camera.h"       // ESP32-CAM camera driver
 
// === Pin definitions for the AI Thinker ESP32-CAM ===
// These map the ESP32 GPIOs to the camera's physical pins.
#define PWDN_GPIO_NUM     32   // Power down pin (turns camera off/on)
#define RESET_GPIO_NUM    -1   // Reset not used
#define XCLK_GPIO_NUM      0   // XCLK signal for camera clock
#define SIOD_GPIO_NUM     26   // I2C data line to SCCB (camera control bus)
#define SIOC_GPIO_NUM     27   // I2C clock line
#define Y9_GPIO_NUM       35   // Data pin 9
#define Y8_GPIO_NUM       34   // Data pin 8
#define Y7_GPIO_NUM       39   // Data pin 7
#define Y6_GPIO_NUM       36   // Data pin 6
#define Y5_GPIO_NUM       21   // Data pin 5
#define Y4_GPIO_NUM       19   // Data pin 4
#define Y3_GPIO_NUM       18   // Data pin 3
#define Y2_GPIO_NUM        5   // Data pin 2
#define VSYNC_GPIO_NUM    25   // Vertical sync signal
#define HREF_GPIO_NUM     23   // Horizontal reference signal
#define PCLK_GPIO_NUM     22   // Pixel clock signal
 
// === TensorFlow Lite Micro headers ===
#include "tensorflow/lite/micro/all_ops_resolver.h"  // Registers all supported ops
#include "tensorflow/lite/micro/micro_interpreter.h" // Runs inference on microcontrollers
#include "tensorflow/lite/schema/schema_generated.h" // TFLite model schema
#include "tensorflow/lite/version.h"                 // Version check helper
#include "person_detect_model_data.h"                // Compiled TinyML model (array of bytes)
 
// === Model input settings ===
static const int kNumCols = 96;       // Model input width
static const int kNumRows = 96;       // Model input height
static const int kNumChannels = 1;    // Grayscale image = 1 channel
// Working memory (RAM) for inference. A 220 KB static buffer may not fit in
// internal SRAM alongside the camera driver; on PSRAM-equipped boards,
// allocating the arena with heap_caps_malloc(..., MALLOC_CAP_SPIRAM) is a
// common alternative.
static const int kTensorArenaSize = 220 * 1024;
static uint8_t tensor_arena[kTensorArenaSize];  // Memory buffer used by TFLM
 
// === TensorFlow model + interpreter pointers ===
const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
 
// Flash LED pin on AI Thinker board
const int FLASH_LED_PIN = 4;
 
// -----------------------------------------------------------------------------
// Function: resize_to_96x96_grayscale
// Downsamples a larger grayscale frame (160x120) to the 96x96 size
// expected by the TinyML model. Uses a simple nearest-neighbor method.
// -----------------------------------------------------------------------------
static bool resize_to_96x96_grayscale(uint8_t* dst, int dw, int dh,
                                      const uint8_t* src, int sw, int sh) {
  if (!dst || !src) return false;          // Sanity check
  for (int y = 0; y < dh; y++) {
    int sy = (y * sh) / dh;                // Source row (scaled index)
    for (int x = 0; x < dw; x++) {
      int sx = (x * sw) / dw;              // Source column (scaled index)
      dst[y * dw + x] = src[sy * sw + sx];
    }
  }
  return true;
}
 
// -----------------------------------------------------------------------------
// Function: init_camera
// Configures and initializes the ESP32-CAM peripheral.
// Returns true if camera setup succeeds.
// -----------------------------------------------------------------------------
static bool init_camera() {
  camera_config_t config = {};             // Initialize configuration struct
  config.ledc_channel = LEDC_CHANNEL_0;    // LEDC timer channel for XCLK PWM
  config.ledc_timer   = LEDC_TIMER_0;
  config.pin_d0       = Y2_GPIO_NUM;
  config.pin_d1       = Y3_GPIO_NUM;
  config.pin_d2       = Y4_GPIO_NUM;
  config.pin_d3       = Y5_GPIO_NUM;
  config.pin_d4       = Y6_GPIO_NUM;
  config.pin_d5       = Y7_GPIO_NUM;
  config.pin_d6       = Y8_GPIO_NUM;
  config.pin_d7       = Y9_GPIO_NUM;
  config.pin_xclk     = XCLK_GPIO_NUM;
  config.pin_pclk     = PCLK_GPIO_NUM;
  config.pin_vsync    = VSYNC_GPIO_NUM;
  config.pin_href     = HREF_GPIO_NUM;
  config.pin_sscb_sda = SIOD_GPIO_NUM;
  config.pin_sscb_scl = SIOC_GPIO_NUM;
  config.pin_pwdn     = PWDN_GPIO_NUM;
  config.pin_reset    = RESET_GPIO_NUM;
  config.xclk_freq_hz = 20000000;          // 20 MHz camera clock
  config.pixel_format = PIXFORMAT_GRAYSCALE; // Capture grayscale images
  config.frame_size   = FRAMESIZE_QQVGA;     // 160×120 resolution
  config.fb_count     = 2;                   // Two frame buffers
 
  // Initialize the camera driver
  return (esp_camera_init(&config) == ESP_OK);
}
 
// -----------------------------------------------------------------------------
// setup()
// Runs once at startup.
// -----------------------------------------------------------------------------
void setup() {
  Serial.begin(115200);          // Start serial console
  delay(300);
  Serial.println("\n[ESP32-CAM] TinyML Person Detection");
 
  // Prepare LED (used as output indicator)
  pinMode(FLASH_LED_PIN, OUTPUT);
  digitalWrite(FLASH_LED_PIN, LOW);
 
  // Initialize camera
  if (!init_camera()) {
    Serial.println("Camera init failed");
    while (true) delay(1000);    // Halt here if setup fails
  }
 
  // Load the compiled TensorFlow Lite model from flash
  model = tflite::GetModel(g_person_detect_model_data);
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    Serial.println("Model schema version mismatch");
    while (true) delay(1000);    // Halt: model and library are incompatible
  }
 
  // Build interpreter β€” this binds the model, operations, and tensor arena
  static tflite::AllOpsResolver resolver;  // Includes all operators
  static tflite::MicroInterpreter static_interpreter(
      model, resolver, tensor_arena, kTensorArenaSize);
  interpreter = &static_interpreter;
 
  // Allocate input/output tensors inside the tensor arena
  if (interpreter->AllocateTensors() != kTfLiteOk) {
    Serial.println("AllocateTensors() failed - tensor arena too small?");
    while (true) delay(1000);    // Halt here if allocation fails
  }
 
  // Pointer to input tensor for convenience
  input = interpreter->input(0);
}
 
// -----------------------------------------------------------------------------
// loop()
// Captures frames continuously, preprocesses them, runs inference,
// and lights the LED if a person is detected.
// -----------------------------------------------------------------------------
void loop() {
  // Capture a frame from the camera
  camera_fb_t* fb = esp_camera_fb_get();
  if (!fb) return; // Skip if capture failed
 
  // Resize the camera frame (160x120) to 96x96 for the model input.
  // Note: this writes raw uint8 pixels; if your model expects a signed
  // int8 input tensor, shift each pixel by -128 first.
  resize_to_96x96_grayscale(input->data.uint8, kNumCols, kNumRows,
                            fb->buf, fb->width, fb->height);
 
  // Release the frame buffer so camera can capture next frame
  esp_camera_fb_return(fb);
 
  // Run inference using TensorFlow Lite Micro
  if (interpreter->Invoke() == kTfLiteOk) {
    // Fetch output tensor (contains model results)
    TfLiteTensor* output = interpreter->output(0);
 
    // Convert quantized int8 output to floating-point probability
    float person_score = (output->data.int8[1] - output->params.zero_point)
                         * output->params.scale;
 
    // Print detection confidence to Serial
    Serial.printf("person_score=%.2f\n", person_score);
 
    // Turn on flash LED if confidence > 0.6
    digitalWrite(FLASH_LED_PIN, person_score > 0.6f ? HIGH : LOW);
  }
 
  // Small delay before next frame
  delay(200);
}
 

✅ Result: The onboard flash LED lights when a person appears in the frame, with all inference done locally.


🧠 TinyML Starter Template (Takeaway)

This starter code runs a custom Edge Impulse model on any ESP32 (e.g., ESP32-DevKit or ESP32-CAM). Just replace the #include headers with those from your exported Edge Impulse (EI) Arduino library.

#include <Arduino.h>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"
#include "model-parameters/model_parameters.h"
 
const int LED_PIN = 2;
 
void setup() {
  Serial.begin(115200);
  pinMode(LED_PIN, OUTPUT);
  Serial.println("TinyML Starter running...");
}
 
void loop() {
  static float features[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE] = {0};
  // Fill features[] with sensor or preprocessed data here
 
  signal_t signal;
  numpy::signal_from_buffer(features, EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE, &signal);
  ei_impulse_result_t result;
  if (run_classifier(&signal, &result, false) == EI_IMPULSE_OK) {
    float best = 0; const char* label = "";
    for (auto &c : result.classification) if (c.value > best) { best = c.value; label = c.label; }
    Serial.printf("%s: %.2f\n", label, best);
    digitalWrite(LED_PIN, best > 0.7f ? HIGH : LOW);
  }
  delay(200);
}

🎥 Video Demo (Coming Soon)

A live demonstration of ESP32-CAM detecting a person in real time without cloud inference will be added after November 20.

Stay tuned: the video will show inference timing, LED response, and live frame output.

💾 GitHub-Ready Code: Takeaway!

📁 esp32cam-tinyml-demo/
 ┣ 📂 src/
 ┃ ┣ main.cpp
 ┃ ┗ person_detect_model_data.h
 ┣ 📄 platformio.ini
 ┗ README.md

Clone → build with PlatformIO → upload → see your MCU think.

