
Reland Pixel8 benchmark migration

Open • pzread opened this issue 1 year ago • 1 comment

We now have a cooling plate for the Pixel 8 and expect more stable performance. This relands #16087.

pzread, Feb 13 '24 20:02

Abbreviated Benchmark Summary

@ commit 74ac1039bc4cb8acf4b21553be34fcb542ffaa2b (no previous benchmark results to compare)

Data-Tiling Comparison Table

| Name | No-DT (baseline) | DT-Only | DT-UK |
| --- | --- | --- | --- |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 50.181 (1.0X) | N/A | 87.688 (0.6X) |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 51.919 (1.0X) | N/A | 89.274 (0.6X) |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 19.879 (1.0X) | N/A | 32.144 (0.6X) |
| GPT2_117M_TF_1X1XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 50.490 (1.0X) | N/A | 57.681 (0.9X) |
| GPT2_117M_TF_1X1XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 57.142 (1.0X) | N/A | 73.731 (0.8X) |
| GPT2_117M_TF_1X1XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 27.507 (1.0X) | N/A | 19.812 (1.4X) |
| GPT2_117M_TF_1X4XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 77.218 (1.0X) | N/A | 52.731 (1.5X) |
| GPT2_117M_TF_1X4XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 82.488 (1.0X) | N/A | 58.862 (1.4X) |
| GPT2_117M_TF_1X4XI32(stablehlo) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 30.651 (1.0X) | N/A | 22.033 (1.4X) |
| MobileBertSquad_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 691.665 (1.0X) | N/A | 337.221 (2.1X) |
| MobileBertSquad_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 703.626 (1.0X) | N/A | 348.609 (2.0X) |
| MobileBertSquad_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 269.177 (1.0X) | N/A | 166.383 (1.6X) |
| MobileBertSquad_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 1053.220 (1.0X) | N/A | 316.718 (3.3X) |
| MobileBertSquad_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 1053.514 (1.0X) | N/A | 322.727 (3.3X) |
| MobileBertSquad_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 366.597 (1.0X) | N/A | 129.367 (2.8X) |
| Vit_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 2651.795 (1.0X) | N/A | 812.961 (3.3X) |
| Vit_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 2658.622 (1.0X) | N/A | 830.846 (3.2X) |
| Vit_int8(tflite) [armv9-a-generic-linux_android34-llvm_cpu] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 845.223 (1.0X) | N/A | 268.275 (3.2X) |
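
The multipliers in parentheses appear to be the baseline latency divided by the variant's latency, so values below 1.0X mean the DT-UK build is slower than No-DT. A minimal sketch of that arithmetic, assuming one-decimal rounding (the `speedup` helper is hypothetical and is not taken from the benchmark tooling):

```python
def speedup(baseline_ms: float, candidate_ms: float) -> str:
    """Return baseline/candidate as a one-decimal multiplier string.

    Assumption: the table's "(N.NX)" values are computed this way; the
    real summary script may round or format differently.
    """
    return f"{baseline_ms / candidate_ms:.1f}X"


# (No-DT latency, DT-UK latency) copied from rows of the table above.
rows = {
    "DeepLabV3_fp32 local_sync": (50.181, 87.688),
    "GPT2_117M_TF_1X1XI32 local_task 5-thread": (27.507, 19.812),
    "MobileBertSquad_int8 local_sync": (1053.220, 316.718),
}

for name, (no_dt, dt_uk) in rows.items():
    print(f"{name}: {speedup(no_dt, dt_uk)}")
```

Under this reading, DeepLabV3_fp32 regresses with DT-UK on all three configurations, while the int8 models gain roughly 3X.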

Raw Latencies

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu][default-flags,dt-uk] local_sync(embedded_elf)[full-inference,default-flags] with default @ pixel-8-pro[big-cores] | 87.688 | 87.656 | 0.363 |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu][default-flags,dt-uk] local_task(embedded_elf)[1-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 89.274 | 89.351 | 0.581 |
| DeepLabV3_fp32(tflite) [armv9-a-generic-linux_android34-llvm_cpu][default-flags,dt-uk] local_task(embedded_elf)[5-thread,full-inference,system-scheduling] with default @ pixel-8-pro[big-cores] | 32.144 | 32.090 | 0.541 |

[Top 3 out of 47 results shown]
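
Since the goal of the cooling plate is more stable results, one rough way to read the raw latencies is the coefficient of variation (standard deviation divided by mean). This is only an illustrative sketch: the helper name is hypothetical, and nothing in the issue defines a stability threshold.

```python
def coefficient_of_variation(mean_ms: float, stddev_ms: float) -> float:
    """Relative run-to-run spread: standard deviation as a fraction of the mean."""
    return stddev_ms / mean_ms


# (label, average latency in ms, standard deviation in ms) from the rows shown above.
samples = [
    ("DeepLabV3_fp32 dt-uk local_sync", 87.688, 0.363),
    ("DeepLabV3_fp32 dt-uk local_task 1-thread", 89.274, 0.581),
    ("DeepLabV3_fp32 dt-uk local_task 5-thread", 32.144, 0.541),
]

for label, mean_ms, stddev_ms in samples:
    cv = coefficient_of_variation(mean_ms, stddev_ms)
    print(f"{label}: {cv:.2%}")
```

A lower percentage means less run-to-run variation for that benchmark.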

No improved or regressed compilation metrics 🏖️

For more information:

Source Workflow Run

github-actions[bot], Feb 13 '24 21:02