Why does the benchmark program run with only one thread?
After building, I tried running some of the benchmark programs, such as f16-gemm-bench. Why does it run with only one thread? Are there any options to make it run with multiple threads?
You can use models/benchmark with --num_threads=2 etc.:
bazel build -c opt //bench/models:benchmark
bazel-bin/bench/models/benchmark --benchmark_filter=V2
Benchmark                      Time       CPU  Iterations  UserCounters...
FP32MobileNetV2/real_time   5327 us   5326 us         127  cpufreq=3.32637G
FP16MobileNetV2/real_time  16901 us  16901 us          40  cpufreq=3.45263G
QS8MobileNetV2/real_time    7883 us   7881 us          83  cpufreq=3.28256G
bazel-bin/bench/models/benchmark --benchmark_filter=V2 --num_threads=2
Benchmark                      Time       CPU  Iterations  UserCounters...
FP32MobileNetV2/real_time   3226 us   3226 us         217  cpufreq=3.35078G
FP16MobileNetV2/real_time   9327 us   9315 us          71  cpufreq=3.48561G
QS8MobileNetV2/real_time    4827 us   4827 us         136  cpufreq=3.27259G
If you run perf record/perf report or watch top while the benchmark runs, you'll see multiple threads running.
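For example, here is one quick way to check (a sketch; the binary path assumes the Bazel build above, and the pgrep -f pattern is just an assumption about what will match the running process):

top -H -p $(pgrep -f bench/models/benchmark)

or record a profile and then inspect per-thread samples:

perf record -- bazel-bin/bench/models/benchmark --benchmark_filter=V2 --num_threads=2
perf report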
There is also a TFLite benchmark_model tool in the TensorFlow repository (tensorflow/lite/tools/benchmark:benchmark_model). It gives similar results and works with a variety of TFLite models, since TFLite inference runs on top of XNNPACK.
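If you need to build it yourself, the usual Bazel invocation from a TensorFlow checkout is:

bazel build -c opt tensorflow/lite/tools/benchmark:benchmark_model

which puts the binary at bazel-bin/tensorflow/lite/tools/benchmark/benchmark_model.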
benchmark_model --graph=mobilenet_v2_1.00_224_int8.tflite --num_threads=1 --num_runs=5
INFO: Inference timings in us: Init: 18194, First inference: 9283, Warmup (avg): 8374.68, Inference (avg): 7927.9
benchmark_model --graph=mobilenet_v2_1.00_224_int8.tflite --num_threads=2 --num_runs=5
INFO: Inference timings in us: Init: 17629, First inference: 7933, Warmup (avg): 4846.4, Inference (avg): 4670.13
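If your build does not apply the XNNPACK delegate by default, you can request it explicitly with benchmark_model's --use_xnnpack flag (the thread and run counts below are just example values):

benchmark_model --graph=mobilenet_v2_1.00_224_int8.tflite --use_xnnpack=true --num_threads=2 --num_runs=5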