esp-bsp
feature(lvgl_port): Initial support for ppa rendering in lvgl (BSP-563)
Mainly supports the following features:
- color blend (simple fill)
- color blend with opa
- blend normal (blend with argb8888)
- blend normal with color (color convert / memcpy)
This feature can only be used in direct/full mode.
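To illustrate what "color blend with opa" means on an RGB565 buffer, here is a minimal software reference sketch. It is not the code from this PR and not LVGL's implementation; the names `blend_rgb565` and `fill_rgb565_opa` are hypothetical, and a hardware path would be expected to produce similar results up to rounding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Blend one RGB565 foreground color over one RGB565 background pixel.
 * opa: 0 = fully transparent, 255 = fully opaque.
 * Hypothetical helper for illustration only. */
static uint16_t blend_rgb565(uint16_t bg, uint16_t fg, uint8_t opa)
{
    uint32_t inv = 255u - opa;
    /* Mix each channel separately, rounding to nearest. */
    uint32_t r = (((fg >> 11) & 0x1Fu) * opa + ((bg >> 11) & 0x1Fu) * inv + 127u) / 255u;
    uint32_t g = (((fg >> 5) & 0x3Fu) * opa + ((bg >> 5) & 0x3Fu) * inv + 127u) / 255u;
    uint32_t b = ((fg & 0x1Fu) * opa + (bg & 0x1Fu) * inv + 127u) / 255u;
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* Fill a w x h rectangle with `color` at opacity `opa`.
 * stride_px is the buffer row length in pixels (>= w). */
static void fill_rgb565_opa(uint16_t *dst, size_t stride_px, size_t w, size_t h,
                            uint16_t color, uint8_t opa)
{
    assert(dst != NULL && stride_px >= w);
    for (size_t y = 0; y < h; y++) {
        uint16_t *row = dst + y * stride_px;
        for (size_t x = 0; x < w; x++) {
            /* Opaque fill is a plain store ("simple fill"). */
            row[x] = (opa == 255u) ? color : blend_rgb565(row[x], color, opa);
        }
    }
}
```

With `opa == 255` this degenerates to the simple fill case; any other value reads the destination pixel first, which is why the blended variant costs more memory bandwidth.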
@peter-marcisovsky @espzav hi, PTAL. This is my first commit for esp-bsp, so I am not yet familiar with some of the coding rules.
Could you consider adding some tests for your feature? If you take a look at my #357, there are functionality and benchmark unit tests. Would it be possible to do the same for this feature?
Also, can you take a look at the failing CI?
Yes, I can add some tests. I have already included the newly submitted lvgl_port component in my project and run the benchmark without any issues. I will also try running the existing test_apps. Note that because the PPA path implements some of your current functions, it conflicts with the current header file, so I have added a new header file. The following configuration is required:
CONFIG_LV_DRAW_SW_ASM_CUSTOM=y
CONFIG_LV_DRAW_SW_ASM_CUSTOM_INCLUDE="esp_lvgl_port_lv_blend_ppa.h"
CONFIG_LV_USE_CUSTOM_MALLOC=y
I am not sure why the benchmark was not printed to a comment. I checked the artifacts, and it seems that this optimization does not help.
Benchmark for BOARD esp32_p4_function_ev_board
DATE: 27.03.2025 14:06
LVGL version: 9.2.2
| Name | Avg. CPU | Avg. FPS | Avg. time | render time | flush time |
|---|---|---|---|---|---|
| Empty screen | 88% (+27) | 66 (-21) | 12 (+7) | 5 | 7 (+7) |
| Moving wallpaper | 99% (+7) | 22 (-49) | 40 (+30) | 38 (+31) | 2 (-1) |
| Single rectangle | 97% (+74) | 69 (-20) | 12 (+11) | 0 (-1) | 12 (+12) |
| Multiple rectangles | 97% (+53) | 69 (-29) | 11 (+5) | 7 (+3) | 4 (+2) |
| Multiple RGB images | 98% (+6) | 22 (-29) | 37 (+23) | 27 (+15) | 10 (+8) |
| Multiple ARGB images | 99% (+1) | 16 (-8) | 53 (+18) | 51 (+18) | 2 |
| Rotated ARGB images | 99% | 3 | 280 (+33) | 267 (+22) | 13 (+11) |
| Multiple labels | 98% (-1) | 33 (+1) | 23 (-3) | 18 (-6) | 5 (+3) |
| Screen sized text | 10% (+3) | 89 (+1) | 8 (+4) | 2 (-2) | 6 (+6) |
| Multiple arcs | 90% (-2) | 34 (-5) | 23 (+1) | 19 (-1) | 4 (+2) |
| Containers | 99% (+43) | 22 (-67) | 34 (+29) | 28 (+23) | 6 (+6) |
| Containers with overlay | 99% (+2) | 13 (-4) | 67 (+16) | 57 (+9) | 10 (+7) |
| Containers with opa | 99% (+28) | 14 (-62) | 56 (+47) | 49 (+40) | 7 (+7) |
| Containers with opa_layer | 99% | 8 (-2) | 117 (+34) | 109 (+28) | 8 (+6) |
| Containers with scrolling | 99% | 12 (-6) | 72 (+22) | 68 (+20) | 4 (+2) |
| Widgets demo | 99% | 16 | 51 (-2) | 43 (-8) | 8 (+6) |
| All scenes avg. | 91% (+15) | 31 (-19) | 55 (+17) | 49 (+12) | 6 (+5) |
In fact, our PPA/DMA2D acceleration is currently only useful for specific scenarios:
- Color Conversion (RGB565<-->RGB888)
- Memory Copy
- Data filling
- Overlay transparent colors
These scenarios only show a clear benefit on larger areas. The benchmark is a comprehensive evaluation and contains many small areas that are drawn separately. When the scenarios above are evaluated on their own, the acceleration is more effective, for example when flushing a decoded JPEG data stream or filling the background color while a menu slides. In addition, I recommend enabling tear avoidance for all tests (CONFIG_BSP_DISPLAY_LVGL_AVOID_TEAR=y); otherwise the results have little value for practical applications.
Benchmark summary (LVGL 9.2.2): all-scenes average FPS, with avoid-tear enabled and direct mode
| Color format | With PPA | Without PPA |
|---|---|---|
| RGB565 | 31 fps | 23 fps |
| RGB888 | 28 fps (25 fps with DMA2D rate limiting) | 19 fps |
This table shows that the average frame rate increases by about 8-9 fps with PPA/DMA2D (roughly 35% for RGB565 and 47% for RGB888).
With PPA in RGB888, the DSI underruns, so the DMA2D speed has to be limited with the code below. The average then drops from 28 to 25 fps, which is still 6 fps higher than without PPA.
#include "hal/axi_dma_ll.h"
#include "hal/axi_icm_ll.h"
#include "soc/mipi_dsi_bridge_struct.h"
int peak_level = 5; // 4 means 800MB/s, 5 means 400MB/s, 6 means 200MB/s
axi_icm_ll_set_qos_burstiness(AXI_ICM_MASTER_DMA2D, 256, AXI_ICM_ACCESS_WRITE);
axi_icm_ll_set_qos_peak_transaction_rate(AXI_ICM_MASTER_DMA2D, peak_level, peak_level + 1, AXI_ICM_ACCESS_WRITE);
The default mode used for the CI benchmark may not enable the avoid-tear solution, so the two sets of results cannot be compared directly.
PTAL again, thanks @espzav @peter-marcisovsky