iree
Reland Pixel8 benchmark migration
We now have a cooling plate for the Pixel 8 and expect more stable performance. Relands #16087.
Abbreviated Benchmark Summary
@ commit 74ac1039bc4cb8acf4b21553be34fcb542ffaa2b (no previous benchmark results to compare)
Data-Tiling Comparison Table
| Name | No-DT (baseline) | DT-Only | DT-UK |
|---|---|---|---|
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 50.181 (1.0X) | N/A | 87.688 (0.6X) |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 51.919 (1.0X) | N/A | 89.274 (0.6X) |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 19.879 (1.0X) | N/A | 32.144 (0.6X) |
| GPT2_117M_TF_1X1XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 50.490 (1.0X) | N/A | 57.681 (0.9X) |
| GPT2_117M_TF_1X1XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 57.142 (1.0X) | N/A | 73.731 (0.8X) |
| GPT2_117M_TF_1X1XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 27.507 (1.0X) | N/A | 19.812 (1.4X) |
| GPT2_117M_TF_1X4XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 77.218 (1.0X) | N/A | 52.731 (1.5X) |
| GPT2_117M_TF_1X4XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 82.488 (1.0X) | N/A | 58.862 (1.4X) |
| GPT2_117M_TF_1X4XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 30.651 (1.0X) | N/A | 22.033 (1.4X) |
| MobileBertSquad_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 691.665 (1.0X) | N/A | 337.221 (2.1X) |
| MobileBertSquad_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 703.626 (1.0X) | N/A | 348.609 (2.0X) |
| MobileBertSquad_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 269.177 (1.0X) | N/A | 166.383 (1.6X) |
| MobileBertSquad_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 1053.220 (1.0X) | N/A | 316.718 (3.3X) |
| MobileBertSquad_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 1053.514 (1.0X) | N/A | 322.727 (3.3X) |
| MobileBertSquad_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 366.597 (1.0X) | N/A | 129.367 (2.8X) |
| Vit_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 2651.795 (1.0X) | N/A | 812.961 (3.3X) |
| Vit_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 2658.622 (1.0X) | N/A | 830.846 (3.2X) |
| Vit_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 845.223 (1.0X) | N/A | 268.275 (3.2X) |
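The parenthesized ratios in the table appear to be the No-DT baseline latency divided by the variant's latency, rounded to one decimal place, so values above 1.0X mean the variant is faster. The following is a minimal sketch of that arithmetic, not the benchmark tooling's actual code; the `speedup` helper and the shortened row labels are hypothetical, and the latencies are assumed to be in milliseconds, as in the Raw Latencies table.

```python
def speedup(baseline_ms: float, variant_ms: float) -> str:
    """Format baseline/variant latency as an 'N.NX' ratio (assumed convention)."""
    return f"{baseline_ms / variant_ms:.1f}X"


# (No-DT baseline, DT-UK) latencies copied from three rows of the table above.
# The row labels are shortened here for readability.
rows = {
    "DeepLabV3_fp32 local_sync": (50.181, 87.688),
    "GPT2_117M_TF_1X1XI32 local_task 5-thread": (27.507, 19.812),
    "MobileBertSquad_int8 local_sync": (1053.220, 316.718),
}

for name, (baseline, dt_uk) in rows.items():
    print(f"{name}: {dt_uk:.3f} ({speedup(baseline, dt_uk)})")
```

For these three rows the sketch reproduces the table's 0.6X, 1.4X, and 3.3X. Read this way, DT-UK is slower than the baseline for DeepLabV3_fp32 and faster for the MobileBertSquad and Vit_int8 models.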
Raw Latencies
| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
|---|---|---|---|
| DeepLabV3\_fp32(tflite) [armv9-a-generic-linux\_android34-llvm\_cpu][default-flags,dt-uk] local\_sync(embedded\_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 87.688 | 87.656 | 0.363 |
| DeepLabV3\_fp32(tflite) [armv9-a-generic-linux\_android34-llvm\_cpu][default-flags,dt-uk] local\_task(embedded\_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 89.274 | 89.351 | 0.581 |
| DeepLabV3\_fp32(tflite) [armv9-a-generic-linux\_android34-llvm\_cpu][default-flags,dt-uk] local\_task(embedded\_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 32.144 | 32.090 | 0.541 |
[Top 3 out of 47 results shown]
No improved or regressed compilation metrics 🏖️
For more information: