Modular Camera Framework with JPEG Encoding Support.

Open DT-art1 opened this issue 1 year ago • 16 comments

What does this implement/fix?

This pull request introduces a new modular camera framework in ESPHome, designed to support simpler camera modules like the MLX90640 thermal camera. It also adds support for the software JPEG encoder from bitbank2/JPEGENC, enabling efficient image compression and streaming, especially on resource-constrained devices like the ESP32-S3.

The goal is to provide an easy way to integrate camera streams into Home Assistant, even for devices with limited processing power, by utilizing software JPEG encoding for small images.

Key Features

  • Incremental image capture and JPEG encoding support.
  • Incremental overlays during camera capture.
  • A new camera_loop() method enables non-blocking, incremental processing of capture and encoding tasks.
  • A general-purpose camera base class for easy addition of new camera sensors.
  • A standalone camera component implementation for direct integration with Home Assistant.
  • Support for software JPEG encoding using the bitbank2/JPEGENC library, allowing efficient compression of images for streaming.
  • Initial focus on the MLX90640 thermal camera, which works well with the ESP32-S3 and can be efficiently encoded to JPEG format.
  • Optimized for smaller cameras, enabling resource-constrained devices to capture and stream images with minimal overhead.
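
To illustrate what non-blocking, incremental processing can look like, here is a minimal sketch of a per-frame state machine that is advanced by one bounded slice of work on every main-loop iteration. It is not the PR's implementation: `Stage`, `Task`, `FrameStateMachine`, `run_until_ready`, and the "work unit" budget are all hypothetical names and assumptions; only the `camera_loop()` name is borrowed from the feature list above.

```cpp
#include <cstdint>

// Hypothetical stages of one frame; the names are illustrative, not taken from the PR.
enum class Stage : uint8_t { CAPTURE, OVERLAY, ENCODE, IDLE };

// Assumed shape of a resumable unit of work: step() returns true once it has finished.
struct Task {
  uint32_t remaining;  // work units still to do

  // Do at most `budget` units of work and return immediately (never block).
  bool step(uint32_t budget) {
    remaining -= (remaining < budget) ? remaining : budget;
    return remaining == 0;
  }
};

class FrameStateMachine {
 public:
  FrameStateMachine(uint32_t capture, uint32_t overlay, uint32_t encode)
      : tasks_{{capture}, {overlay}, {encode}} {}

  // Called once per main-loop iteration; advances the current stage by one slice.
  void camera_loop(uint32_t budget) {
    if (stage_ == Stage::IDLE) return;
    Task &task = tasks_[static_cast<uint8_t>(stage_)];
    if (task.step(budget))
      stage_ = static_cast<Stage>(static_cast<uint8_t>(stage_) + 1);
  }

  bool frame_ready() const { return stage_ == Stage::IDLE; }

 private:
  Task tasks_[3];
  Stage stage_{Stage::CAPTURE};
};

// Drives the state machine the way a main loop would and returns the iteration count.
inline int run_until_ready(FrameStateMachine &fsm, uint32_t budget) {
  int iterations = 0;
  while (!fsm.frame_ready()) {
    fsm.camera_loop(budget);
    ++iterations;
  }
  return iterations;
}
```

The point of the design is that no single call runs longer than one budgeted slice, so other components keep getting loop time while a frame is captured, overlaid, and encoded.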

Proof of Concept

A working example using this pull request is available in the following repository, demonstrating the integration of the MLX90640 camera with Home Assistant, utilizing JPEG encoding for efficient image streaming:

https://github.com/DT-art1/esphome-mlx90640/tree/dependent-on-pr-7639

This POC demonstrates the camera component's ability to capture small thermal images, encode them to JPEG, and stream them to Home Assistant.

Motivation

ESPHome previously lacked an easy and efficient way to integrate smaller camera modules like the MLX90640 into Home Assistant. By adding support for software JPEG encoding, this pull request enables efficient compression of images and makes it possible to stream them to Home Assistant with minimal resource overhead.

This new framework simplifies the integration of camera sensors, particularly for home automation and monitoring, and provides a clean, reusable solution for developers wanting to add camera support to ESPHome.

Additional Considerations

  • JPEG encoding is performed using the software-based bitbank2/JPEGENC library, which is efficient for smaller images but may not scale well to larger images.
  • For small images, this makes it possible to stream high-quality JPEG output with low resource usage, which suits devices like the ESP32-S3 with limited hardware encoding capabilities.
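
A rough sense of why image size matters can be had from simple arithmetic. The sketch below computes the uncompressed RGB888 frame size and the number of JPEG MCUs (8x8 pixel blocks for 4:4:4, 16x16 for 4:2:0) for the frame sizes mentioned in this description, plus the 32x24 resolution of the MLX90640. The helper names are hypothetical and the calculation assumes unpadded rows; it says nothing about actual encoder speed.

```cpp
#include <cstddef>

// Bytes needed for one uncompressed RGB888 frame (3 bytes per pixel, no row padding assumed).
constexpr std::size_t rgb888_bytes(std::size_t width, std::size_t height) {
  return width * height * 3;
}

// Number of JPEG minimum coded units (MCUs) in a frame.
// mcu_size is 8 for 4:4:4 and 16 for 4:2:0 subsampling; partial MCUs at the edges round up.
constexpr std::size_t mcu_count(std::size_t width, std::size_t height, std::size_t mcu_size) {
  return ((width + mcu_size - 1) / mcu_size) * ((height + mcu_size - 1) / mcu_size);
}

static_assert(rgb888_bytes(32, 24) == 2304, "MLX90640-sized frame (32x24)");
static_assert(rgb888_bytes(64, 64) == 12288, "64x64 test pattern");
static_assert(rgb888_bytes(512, 256) == 393216, "512x256 overlay example");
static_assert(mcu_count(64, 64, 8) == 64, "4:4:4 uses 8x8 MCUs");
static_assert(mcu_count(512, 256, 8) == 2048, "4:4:4 uses 8x8 MCUs");
static_assert(mcu_count(512, 256, 16) == 512, "4:2:0 uses 16x16 MCUs");
```

A thermal-camera frame is a few kilobytes and a handful of MCUs, while the 512x256 example already needs 384 KiB for the raw buffer alone, which is where PSRAM and incremental encoding become relevant.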

Types of changes

  • [ ] Bugfix (non-breaking change which fixes an issue)
  • [X] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Other

Related issue or feature (if applicable): fixes

Pull request in esphome-docs with documentation (if applicable): esphome/esphome-docs#4956

Test Environment

  • [X] ESP32
  • [X] ESP32 IDF
  • [ ] ESP8266
  • [ ] RP2040
  • [ ] BK72xx
  • [ ] RTL87xx

Example entry for config.yaml:

# Example config.yaml
esphome:
  name: test-camera

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf

psram:
  mode: octal
  speed: 80MHz

logger:

api:

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

camera:
  name: Test Camera
  height: 64
  width: 64
  encoder_quality: BEST
  encoder_subsampling: 444
  encoder_buffer_grow: 1024

  image_format: RGB888
  on_capture_image:
    - lambda: |-
        static uint8_t cnt = 0;
        uint8_t *rgb = image.data;
        for (uint16_t x = 0; x < spec.width; ++x) {
          for (uint16_t y = 0; y < spec.height; ++y) {
            int idx = y * spec.bytes_per_row() + x * spec.bytes_per_pixel();
            rgb[idx + 0] = cnt + x;
            rgb[idx + 1] = cnt + y;
            rgb[idx + 2] = cnt + x + y;
          }
        }
        ++cnt;

esp32_camera_web_server:
  - port: 8080
    mode: stream
  - port: 8081
    mode: snapshot
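
The pixel addressing used by the on_capture_image lambda above can be tried outside ESPHome. The following is a minimal standalone sketch: `ImageSpec` is a hypothetical stand-in for the `spec` object (only the members the lambda uses are modelled, and rows are assumed to be unpadded), and `fill_gradient` / `make_frame` are illustrative helper names, not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the `spec` object used in the lambda; only the
// members the example relies on are modelled, and rows are assumed unpadded.
struct ImageSpec {
  uint16_t width;
  uint16_t height;
  int bytes_per_pixel() const { return 3; }  // RGB888
  int bytes_per_row() const { return width * bytes_per_pixel(); }
};

// Fills an RGB888 buffer with the same moving gradient as the lambda.
inline void fill_gradient(std::vector<uint8_t> &rgb, const ImageSpec &spec, uint8_t cnt) {
  for (uint16_t y = 0; y < spec.height; ++y) {
    for (uint16_t x = 0; x < spec.width; ++x) {
      int idx = y * spec.bytes_per_row() + x * spec.bytes_per_pixel();
      rgb[idx + 0] = static_cast<uint8_t>(cnt + x);      // red follows x
      rgb[idx + 1] = static_cast<uint8_t>(cnt + y);      // green follows y
      rgb[idx + 2] = static_cast<uint8_t>(cnt + x + y);  // blue follows the diagonal
    }
  }
}

// Convenience wrapper: allocate a frame of the right size and fill it.
inline std::vector<uint8_t> make_frame(const ImageSpec &spec, uint8_t cnt) {
  std::vector<uint8_t> rgb(static_cast<std::size_t>(spec.height) * spec.bytes_per_row());
  fill_gradient(rgb, spec, cnt);
  return rgb;
}
```

Because each channel is a `uint8_t`, the sums wrap around at 256, which is what makes the pattern scroll as `cnt` increases from frame to frame.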

Example entry for config.yaml demonstrating incremental capture, overlays and encoding:

font:
  - file:
      type: gfonts
      family: Roboto
      weight: 700
    id: roboto_20
    size: 20

display:
  platform: camera
  lambda: |-
        int shadow = 1;
        int offset = 2;
        if (context.state == 0) {
          context.x = -shadow;
          context.y = -shadow;
          ++context.state;
        }
        while (context.y <= shadow) {
          while (context.x <= shadow) {
            if ((context.x != 0) && (context.y != 0)) {
              it.print(40 + context.x + offset, 20 + context.y + offset, id(roboto_20), Color(0x000000), "Camera Overlay Rendering:");
              it.print(40 + context.x + offset, 60 + context.y + offset, id(roboto_20), Color(0x000000), "Temperature H: 40°C, 104°F");
              it.print(40 + context.x + offset, 100 + context.y + offset, id(roboto_20), Color(0x000000), "Temperature L: 20°C, 68°F");
            }
            ++context.x;
          }
          // Restart each row at the left shadow offset (not 0) so the left-hand shadow is drawn too
          context.x = -shadow;
          ++context.y;
          context.done = false;
          return;
        }

        it.print(40, 20, id(roboto_20), Color(0xFFFFFF), "Camera Overlay Rendering:");
        it.print(40, 60, id(roboto_20), Color(0xFFFFFF), "Temperature H: 40°C, 104°F");
        it.print(40, 100, id(roboto_20), Color(0xFFFFFF), "Temperature L: 20°C, 68°F");

camera:
  name: Test Camera
  width: 512
  height: 256
  encoder_quality: BEST # [BEST|HIGH|MED|LOW]
  encoder_subsampling: 444 # [444|420]
  encoder_mcu_count: 256
  image_format: RGB888
  on_capture_image:
    - lambda: |-
        static uint8_t cnt = 0;
        uint8_t *rgb = image.data;
        int16_t pixel_cnt = 0;
        while (context.y < spec.height) {
          while (context.x < spec.width) {
            int idx = (context.y * spec.width + context.x) * spec.bytes_per_pixel();
            rgb[idx + 0] = cnt + context.x;
            rgb[idx + 1] = cnt + context.y;
            rgb[idx + 2] = cnt + context.x + context.y;
            if (context.x < 10 && context.y < 10) {
              rgb[idx + 0] = 255;
              rgb[idx + 1] = 0;
              rgb[idx + 2] = 0;
            } else if (context.x > 53 && context.y < 10) {
              rgb[idx + 0] = 0;
              rgb[idx + 1] = 255;
              rgb[idx + 2] = 0;
            } else if (context.x < 10 && context.y > 53){
              rgb[idx + 0] = 0;
              rgb[idx + 1] = 0;
              rgb[idx + 2] = 255;
            } else if (context.x > 53 && context.y > 53){
              rgb[idx + 0] = 255;
              rgb[idx + 1] = 255;
              rgb[idx + 2] = 0;
            } else if (context.x >= 27 && context.x < 37 && context.y >= 27 && context.y < 37){
              rgb[idx + 0] = 0;
              rgb[idx + 1] = 0;
              rgb[idx + 2] = 0;
            }
            ++context.x;
            ++pixel_cnt;
          }
          context.x = 0;
          ++context.y;
          // Incremental image capture. Capture only 16384 pixels in one loop()
          if (pixel_cnt >= 16384) {
            context.done = false;
            return;
          }
        }
        cnt = (cnt + 1) % 128;
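
The resumable pattern in the lambda above (keep the position in `context`, do a bounded amount of work, set `context.done = false`, and return) can be simulated on a desktop. This is a minimal sketch under stated assumptions: `CaptureContext` is a hypothetical stand-in for `context`, and it assumes the framework sets `done` to true before each call and calls again whenever the lambda clears it. The real framework may behave differently.

```cpp
// Hypothetical stand-in for the `context` object in the lambda; field names
// follow the example, everything else is an assumption.
struct CaptureContext {
  int x = 0;
  int y = 0;
  bool done = true;  // assumed to be reset to true by the framework before each call
};

// One slice of work: visits pixels row by row and yields once at least
// `budget` pixels have been handled. Returns the number of pixels visited.
inline int capture_slice(CaptureContext &context, int width, int height, int budget) {
  int pixel_cnt = 0;
  while (context.y < height) {
    while (context.x < width) {
      // ... write the pixel at (context.x, context.y) here ...
      ++context.x;
      ++pixel_cnt;
    }
    context.x = 0;
    ++context.y;
    if (pixel_cnt >= budget) {
      context.done = false;  // ask to be called again on the next loop()
      return pixel_cnt;
    }
  }
  return pixel_cnt;  // context.done stays true: the frame is complete
}

// Simulates the framework calling the lambda until it reports completion.
inline int calls_per_frame(int width, int height, int budget) {
  CaptureContext context;
  int calls = 0;
  do {
    context.done = true;
    capture_slice(context, width, height, budget);
    ++calls;
  } while (!context.done);
  return calls;
}
```

Under these assumptions, the 512x256 frame with a 16384-pixel budget is handled 32 rows at a time: eight calls do the work and a ninth finds nothing left and completes the frame, whereas a 64x64 frame finishes in a single call.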

Screenshot

Screenshot 2025-05-22 203934

Checklist:

  • [x] The code change is tested and works locally.
  • [x] Tests have been added to verify that the new code works (under tests/ folder).

If user exposed functionality or configuration variables are added/changed:

DT-art1 avatar Oct 20 '24 12:10 DT-art1

Hey there @ottowinter, mind taking a look at this pull request as it has been labeled with an integration (api) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

probot-esphome[bot] avatar Oct 20 '24 12:10 probot-esphome[bot]

Hey there @ayufan, mind taking a look at this pull request as it has been labeled with an integration (esp32_camera_web_server) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

probot-esphome[bot] avatar Oct 20 '24 12:10 probot-esphome[bot]

Hey there @DT-art1, Thanks for submitting this pull request! Can you add yourself as a codeowner for this integration? This way we can notify you if a bug report for this integration is reported. In __init__.py of the integration, please add:

CODEOWNERS = ["@DT-art1"]

And run script/build_codeowners.py

(message by NeedsCodeownersLabel)

probot-esphome[bot] avatar Oct 20 '24 12:10 probot-esphome[bot]

Codecov Report

All modified and coverable lines are covered by tests. Project coverage is 72.35%. Comparing base (060bb41) to head (721fef9). Report is 48 commits behind head on dev.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #7639      +/-   ##
==========================================
+ Coverage   72.31%   72.35%   +0.03%     
==========================================
  Files          53       53              
  Lines       11123    11123              
  Branches     1503     1503              
==========================================
+ Hits         8044     8048       +4     
+ Misses       2685     2684       -1     
+ Partials      394      391       -3     

codecov-commenter avatar Oct 20 '24 12:10 codecov-commenter

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

esphome[bot] avatar Nov 03 '24 21:11 esphome[bot]

@jesserockz Hi. I was just wondering if there is anything else to consider or should I consider it as completed?

DT-art1 avatar Nov 22 '24 07:11 DT-art1

Any chance of having this merged in the next release?

nliaudat avatar Feb 20 '25 17:02 nliaudat

@jesserockz Hi, it seems like this pull request could benefit others as well. I wanted to check if there are any additional changes you'd like me to make, or if the only thing holding back the merge is time?

DT-art1 avatar Feb 21 '25 12:02 DT-art1

Tested:

  • [x] Compiles without errors
  • [x] Uploaded successfully to Seeed Studio XIAO ESP32S3 Sense
  • [x] Feature works as expected (see attached screenshot)
  • [x] No breaking changes observed.

Test configuration:

esphome:
  name: refactor-esp32camera
  platformio_options:
    build_flags: -DBOARD_HAS_PSRAM
    board_build.arduino.memory_type: qio_opi
    board_build.f_flash: 80000000L
    board_build.flash_mode: qio

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: arduino

logger:

api:

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

# Configuration XIAO esp32s3 sense
esp32_camera:
  external_clock:
    pin: GPIO10
    frequency: 20MHz
  i2c_pins:
    sda: GPIO40
    scl: GPIO39
  data_pins: [GPIO15, GPIO17, GPIO18, GPIO16, GPIO14, GPIO12, GPIO11, GPIO48]
  vsync_pin: GPIO38
  href_pin: GPIO47
  pixel_clock_pin: GPIO13
  # Automation settings
  on_image:
    then:
    - lambda: |-
        ESP_LOGD("main", "AUTOMATION: on_image len=%d, data=%c", image.length, image.data[0]);
  on_stream_start:
    then:
    - lambda: |-
        ESP_LOGD("main", "AUTOMATION: on_stream_start.");
  on_stream_stop:
    then:
    - lambda: |-
        ESP_LOGD("main", "AUTOMATION: on_stream_stop");

  # Image settings
  name: My Camera
  # ...

esp32_camera_web_server:
  - port: 8080
    mode: stream
  - port: 8081
    mode: snapshot

Note: This PR has been tested with the configuration above. The integration works as intended and does not break previous configurations.

Screenshot 2025-04-20 123101

DT-art1 avatar Apr 20 '25 10:04 DT-art1

Tested successfully on both the esp-idf and arduino frameworks with a Freenove ESP32-S3 WROOM N8R8 (8 MB flash / 8 MB PSRAM):

  external_clock_pin: GPIO15
  external_clock_frequency: 20MHz
  i2c_pins_sda: GPIO4
  i2c_pins_scl: GPIO5
  data_pins: [GPIO11, GPIO9, GPIO8, GPIO10, GPIO12, GPIO18, GPIO17, GPIO16] 
  vsync_pin: GPIO6
  href_pin: GPIO7
  pixel_clock_pin: GPIO13
  status_led_pin: GPIO02
  flash_led_pin: GPIO48

nliaudat avatar Apr 21 '25 14:04 nliaudat

Hi @kbx81 - I noticed you've been reviewing some graphics-related changes recently.

I'm giving this PR one last pass through before potentially closing it as unmerged. If you feel there's value in it, I'd greatly appreciate your perspective.

Previous feedback has been addressed, and docs are already in place. Thanks in advance - no pressure, of course!

DT-art1 avatar Jun 09 '25 19:06 DT-art1

The PR looks well thought out, and it's definitely worth sticking with. There aren’t many reviewers who can handle something of this size, so it might take a bit of time to get the attention it needs.

bdraco avatar Jun 10 '25 05:06 bdraco

Thanks for the kind words and encouragement! I’m happy to stick with it and available if any questions or clarifications come up. Looking forward to hopefully getting this over the finish line 🙌

DT-art1 avatar Jun 10 '25 14:06 DT-art1

Thank you for this excellent work on modernizing the camera framework! The implementation quality is impressive.

Why This Review Is Taking Time: Camera components have relatively few users in ESPHome, and this PR touches 36+ files with significant architectural changes. This combination means we need to be extra careful to avoid disrupting existing users.

Every ESP32 camera user will need to update their configuration:

# OLD
esp32_camera:
  name: "My Camera"
  jpeg_quality: 10

# NEW  
camera:
  name: "My Camera"
  encoder_quality: HIGH      # New required field
  encoder_subsampling: 420   # New required field

Let's split this into smaller PRs to make review easier and reduce risk:

  • First PR: Just migrate ESP32 camera to the new platform with minimal breaking changes
  • Later PRs: Add JPEG encoding, overlays, memory optimizations

This approach will:

  • Get your improvements merged faster
  • Give users time to adapt
  • Make testing more manageable

Would you be willing to start with just the core migration? Once that foundation is solid, we can add all the great enhancements you've built.

Let's work together to land it smoothly! Camera users will be much happier in the long run if we can minimize the configuration changes they need to make.

bdraco avatar Jun 29 '25 11:06 bdraco

Thank you for the thoughtful feedback and kind words!

I fully understand the need for a cautious approach given the scope and potential impact on users. Splitting the PR makes sense, and I’m absolutely on board with starting with a minimal core migration that preserves backward compatibility.

I’ll prepare a focused initial PR with just the foundational changes for ESP32 camera and keep the enhancements (encoding, overlays, etc.) for follow-ups. Looking forward to collaborating on this and ensuring a smooth transition for users.

DT-art1 avatar Jun 29 '25 18:06 DT-art1

To use the changes from this PR as an external component, add the following to your ESPHome configuration YAML file:

external_components:
  - source: github://pr#7639
    components: [camera, camera_encoder]
    refresh: 1h

(Added by the PR bot)

github-actions[bot] avatar Jul 20 '25 17:07 github-actions[bot]

👋 Hi there! This PR modifies 15 file(s) with codeowners.

@esphome/core - As codeowner(s) of the affected files, your review would be appreciated! 🙏

Note: Automatic review request may have failed, but you're still welcome to review.

github-actions[bot] avatar Jul 20 '25 19:07 github-actions[bot]

Hi, many thanks for your great work!

Could you enable the Discussions and Pull requests tabs on https://github.com/DT-art1/esphome/tree/refactor_esp32camera ?

I'd like to contribute the components I need: [camera_cropper] and [camera_decoder].

I may not have the skills :(, but I am working on a project to run general TFLite models on ESPHome and I am struggling with camera image processing. Your component is the future, and right now I am reinventing the wheel :)

Best Regards

nliaudat avatar Aug 26 '25 17:08 nliaudat

Thank you very much!

I've set up a discussion thread to get started. Feel free to post your thoughts there - I'll try to help wherever I can. :-)

I also started integrating a processing stage that sits between the capture and the encoder. I think a TFLite model could fit in there as well. Currently I have just a rescaler and a local (uncommitted) colorizer placed there, but I think other "image processors" could easily be added too.

Best Regards

DT-art1 avatar Aug 26 '25 19:08 DT-art1

@bdraco I just noticed that you also closed this PR (#7639). I'm not sure if this was intentional, since this PR contains almost the entire new camera pipeline. I was thinking of removing the parts that have already been merged upstream from this PR. But if it was intentional, I can create a new PR with only the relevant parts, with the drawback that it won't include a reference to the full camera pipeline.

DT-art1 avatar Aug 31 '25 14:08 DT-art1

The PR I merged had a closing keyword that referenced #7639 (i.e. "fixes https://github.com/esphome/esphome/pull/7639"), so this PR was closed automatically when I merged that one.

bdraco avatar Aug 31 '25 15:08 bdraco

Thanks for reopening and for the clarification. I wasn't aware of this implication.

DT-art1 avatar Aug 31 '25 15:08 DT-art1

Memory Impact Analysis

Components: camera, camera_encoder, camera_pipeline, camera_sensor, esp32_camera, esp32_camera_web_server
Platform: esp32-idf

Metric | Target Branch | This PR | Change
RAM | 53,980 bytes | 54,036 bytes | 📈 🔸 +56 bytes (+0.10%)
Flash | 872,787 bytes | 893,395 bytes | 📈 🚨 +20,608 bytes (+2.36%)

📊 Component Memory Breakdown
Component | Target Flash | PR Flash | Change
[esphome]camera | 596 bytes | 9,391 bytes | 📈 🚨 +8,795 bytes (+1475.67%)
[esphome]camera_pipeline | 0 bytes | 4,072 bytes | 📈 🔸 +4,072 bytes (0.00%)
cpp_stdlib | 726 bytes | 1,903 bytes | 📈 +1,177 bytes (+162.12%)
app_framework | 6,936 bytes | 8,069 bytes | 📈 +1,133 bytes (+16.34%)
[esphome]display | 0 bytes | 845 bytes | 📈 🔸 +845 bytes (0.00%)
[esphome]camera_sensor | 0 bytes | 406 bytes | 📈 🔸 +406 bytes (0.00%)
math_lib | 4,108 bytes | 4,425 bytes | 📈 +317 bytes (+7.72%)
[esphome]esp32_camera | 3,294 bytes | 2,987 bytes | 📉 🎉 -307 bytes (-9.32%)
[esphome]core | 8,300 bytes | 8,589 bytes | 📈 🚨 +289 bytes (+3.48%)
cpp_runtime | 2,865 bytes | 3,097 bytes | 📈 +232 bytes (+8.10%)
wifi_config | 16,412 bytes | 16,603 bytes | 📈 +191 bytes (+1.16%)
network_stack | 40,830 bytes | 41,018 bytes | 📈 +188 bytes (+0.46%)
[esphome]camera_encoder | 584 bytes | 432 bytes | 📉 🎉 -152 bytes (-26.03%)
xtensa | 3,293 bytes | 3,442 bytes | 📈 +149 bytes (+4.52%)
memory_alloc | 690 bytes | 826 bytes | 📈 +136 bytes (+19.71%)
interrupt_handlers | 19,914 bytes | 20,043 bytes | 📈 +129 bytes (+0.65%)
http_server | 5,642 bytes | 5,741 bytes | 📈 +99 bytes (+1.75%)
rom_functions | 32,315 bytes | 32,383 bytes | 📈 +68 bytes (+0.21%)
mdns_lib | 22,969 bytes | 23,033 bytes | 📈 +64 bytes (+0.28%)
[esphome]esp32_camera_web_server | 1,182 bytes | 1,242 bytes | 📈 🚨 +60 bytes (+5.08%)
... | ... | ... | (13 more components not shown)

🔍 Symbol-Level Changes

Changed Symbols

Symbol | Target Size | PR Size | Change
setup() | 1,245 bytes | 2,378 bytes | 📈 +1,133 bytes (+91.00%)
esp_netif_new_api | 502 bytes | 582 bytes | 📈 +80 bytes (+15.94%)
mdns_parse_packet | 7,655 bytes | 7,719 bytes | 📈 +64 bytes (+0.84%)
esphome::esp32_camera_web_server::CameraWebServer::setup() | 175 bytes | 227 bytes | 📈 +52 bytes (+29.71%)
esp_intr_alloc_intrstatus_bind | 968 bytes | 1,015 bytes | 📈 +47 bytes (+4.86%)
esp_read_mac | 276 bytes | 312 bytes | 📈 +36 bytes (+13.04%)
esphome::esp32_camera::ESP32Camera::ESP32Camera() | 195 bytes | 165 bytes | 📉 -30 bytes (-15.38%)
esp_netif_start_api | 324 bytes | 352 bytes | 📈 +28 bytes (+8.64%)
ensure_partitions_loaded [$part$0] | 383 bytes | 411 bytes | 📈 +28 bytes (+7.31%)
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned int, unsigned int, char const*, unsigned int) | 105 bytes | 131 bytes | 📈 +26 bytes (+24.76%)
httpd_req_new | 600 bytes | 624 bytes | 📈 +24 bytes (+4.00%)
esp_netif_destroy_api | 85 bytes | 109 bytes | 📈 +24 bytes (+28.24%)
netif_callback_fn | 215 bytes | 235 bytes | 📈 +20 bytes (+9.30%)
esp_netif_set_ip_info_api | 215 bytes | 235 bytes | 📈 +20 bytes (+9.30%)
esphome::camera::Camera::Camera() | 90 bytes | 110 bytes | 📈 +20 bytes (+22.22%)
httpd_resp_send | 324 bytes | 344 bytes | 📈 +20 bytes (+6.17%)
get_efuse_mac_custom | 123 bytes | 139 bytes | 📈 +16 bytes (+13.01%)
intr_free_for_current_cpu [$isra$0] | 207 bytes | 223 bytes | 📈 +16 bytes (+7.73%)
esp_netif_ip_lost_timer | 137 bytes | 149 bytes | 📈 +12 bytes (+8.76%)
httpd_register_uri_handler | 142 bytes | 154 bytes | 📈 +12 bytes (+8.45%)
esp_clk_tree_xtal32k_get_freq_hz | 72 bytes | 84 bytes | 📈 +12 bytes (+16.67%)
rtc_isr_register | 158 bytes | 170 bytes | 📈 +12 bytes (+7.59%)
httpd_sess_new | 116 bytes | 128 bytes | 📈 +12 bytes (+10.34%)
periph_rtc_dig_clk8m_enable | 80 bytes | 92 bytes | 📈 +12 bytes (+15.00%)
esp_clk_tree_rc_fast_d256_get_freq_hz | 72 bytes | 84 bytes | 📈 +12 bytes (+16.67%)
esp_netif_init | 91 bytes | 103 bytes | 📈 +12 bytes (+13.19%)
esp_iface_mac_addr_set | 140 bytes | 152 bytes | 📈 +12 bytes (+8.57%)
cb_headers_complete | 227 bytes | 238 bytes | 📈 +11 bytes (+4.85%)
get_efuse_factory_mac | 182 bytes | 190 bytes | 📈 +8 bytes (+4.40%)
esp_netif_set_dns_info_api | 74 bytes | 82 bytes | 📈 +8 bytes (+10.81%)
... | ... | ... | (55 more changed symbols not shown)

New Symbols (top 15)

Symbol | Size
esphome::camera::CameraImpl::loop() | 1,452 bytes
std::__detail::__prime_list | 1,028 bytes
esphome::camera_pipeline::ProcessorBase::run_state_machine_(esphome::camera::ImageFormat, esphome::camera::CameraImageSpec*, esphome::camera::Buffer*) | 519 bytes
esphome::display::Display::draw_pixels_at(int, int, int, int, unsigned char const*, esphome::display::ColorOrder, esphome::display::ColorBitness, bool, int, int, int) | 369 bytes
esphome::camera::Pipeline::find_unlinked_() | 325 bytes
floor | 309 bytes
std::__detail::_Insert_base<esphome::camera::Processor*, esphome::camera::Processor*, std::allocator<esphome::camera::Processor*>, std::__detail::_Identity, std::equal_to<esphome::camera::Processor*>, std::hash<esphome::camera::Processor*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::insert(esphome::camera::Processor* const&) [$isra$0] | 308 bytes
std::__detail::_Insert_base<esphome::camera::Output*, esphome::camera::Output*, std::allocator<esphome::camera::Output*>, std::__detail::_Identity, std::equal_to<esphome::camera::Output*>, std::hash<esphome::camera::Output*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, true, true> >::insert(esphome::camera::Output* const&) | 302 bytes
esphome::camera::Pipeline::process() | 295 bytes
esphome::camera::Pipeline::configure() | 250 bytes
esphome::camera::Pipeline::filter_requesters(esphome::camera::Output*, esphome::camera::RequesterFlags const&) | 215 bytes
esphome::display::Display::filled_circle(int, int, int, esphome::Color) | 208 bytes
std::_Hashtable<esphome::camera::Processor*, std::pair<esphome::camera::Processor* const, std::vector<esphome::camera::Processor*, std::allocator<esphome::camera::Processor*> > >, std::allocator<std::pair<esphome::camera::Processor* const, std::vector<esphome::camera::Processor*, std::allocator<esphome::camera::Processor*> > > >, std::__detail::_Select1st, std::equal_to<esphome::camera::Processor*>, std::hash<esphome::camera::Processor*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<esphome::camera::Processor* const, std::vector<esphome::camera::Processor*, std::allocator<esphome::camera::Processor*> > >, false>*, unsigned int) | 207 bytes
std::_Hashtable<esphome::camera::Processor*, std::pair<esphome::camera::Processor* const, esphome::camera::Processor*>, std::allocator<std::pair<esphome::camera::Processor* const, esphome::camera::Processor*> >, std::__detail::_Select1st, std::equal_to<esphome::camera::Processor*>, std::hash<esphome::camera::Processor*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<esphome::camera::Processor* const, esphome::camera::Processor*>, false>*, unsigned int) | 203 bytes
vtable for esphome::camera_pipeline::Overlayer | 200 bytes
238 more new symbols... Total: 16,214 bytes

Removed Symbols (top 15)

Symbol | Size
esphome::camera_encoder::EncoderBufferImpl::set_buffer_size(unsigned int) | 74 bytes
std::_Function_handler<void (std::shared_ptr<esphome::camera::CameraImage>), esphome::esp32_camera::ESP32CameraImageTrigger::ESP32CameraImageTrigger(esphome::esp32_camera::ESP32Camera*)::{lambda(std::shared_ptr<esphome::camera::CameraImage> const&)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<esphome::camera::CameraImage>&&) | 57 bytes
esphome::Action<esphome::esp32_camera::CameraImageData>::play_complex(esphome::esp32_camera::CameraImageData const&) | 49 bytes
esphome::esp32_camera::ESP32Camera::add_image_callback(std::function<void (std::shared_ptr<esphome::camera::CameraImage>)>&&) | 40 bytes
esphome::Action<esphome::esp32_camera::CameraImageData>::stop_complex() | 37 bytes
std::_Function_handler<void (std::shared_ptr<esphome::camera::CameraImage>), esphome::esp32_camera::ESP32CameraImageTrigger::ESP32CameraImageTrigger(esphome::esp32_camera::ESP32Camera*)::{lambda(std::shared_ptr<esphome::camera::CameraImage> const&)#1}>::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) | 37 bytes
setup()::{lambda(esphome::esp32_camera::CameraImageData)#1}::_FUN(esphome::esp32_camera::CameraImageData) | 33 bytes
vtable for esphome::camera_encoder::EncoderBufferImpl | 32 bytes
esphome::Action<esphome::esp32_camera::CameraImageData>::is_running() | 30 bytes
esphome::camera_encoder::ESP32CameraJPEGEncoder::ESP32CameraJPEGEncoder(unsigned char, esphome::camera::EncoderBuffer*) | 30 bytes
vtable for esphome::StatelessLambdaAction<esphome::esp32_camera::CameraImageData> | 28 bytes
esphome::camera_encoder::EncoderBufferImpl::~EncoderBufferImpl() | 18 bytes
esphome::StatelessLambdaAction<esphome::esp32_camera::CameraImageData>::play(esphome::esp32_camera::CameraImageData const&) | 14 bytes
esphome::camera_encoder::EncoderBufferImpl::get_max_size() const | 7 bytes
esphome::camera_encoder::EncoderBufferImpl::get_size() const | 7 bytes
4 more removed symbols... Total: 513 bytes

Note: This analysis measures static RAM and Flash usage only (compile-time allocation). Dynamic memory (heap) cannot be measured automatically. ⚠️ You must test this PR on a real device to measure free heap and ensure no runtime memory issues.

This analysis runs automatically when components change. Memory usage is measured from a merged configuration with 6 components.

github-actions[bot] avatar Nov 03 '25 16:11 github-actions[bot]

You should update the example config in your comments to match your test files:

Failed config

camera_pipeline: [source config/tab5.yaml:199]
  - 
    ID 'sensor' conflicts with the name of an esphome integration, please use another ID name.
    id: sensor

lboue avatar Nov 04 '25 08:11 lboue

Interesting, what was your FPS?

youkorr avatar Nov 04 '25 12:11 youkorr

Thanks for asking about FPS!

The FPS mainly depends on which requester is consuming the frames, since each requester has different throughput:

  • Home Assistant / API requester → typically lower throughput with larger resolutions.
  • Camera Web Server → generally higher sustained FPS.
  • Pipeline-only processing (no streaming) → highest FPS.

The pipeline also supports an overlay processor that can draw a live FPS counter on the image, making it easy to measure in different setups.

Note: My current measurements are done over Wi-Fi, not Ethernet. Ethernet will allow higher sustained FPS on networks that support it.

To provide numbers that match your use case — which camera + resolution + requester are you interested in?

DT-art1 avatar Nov 04 '25 12:11 DT-art1