embedded-ai.bench
[WONT FIX] TensorFlow Lite does not pin to big cores; use taskset
Reducing variance between runs on Android.
Most modern Android phones use ARM big.LITTLE architecture where some cores are more power hungry but faster than other cores. When running benchmarks on these phones there can be significant variance between different runs of the benchmark. One way to reduce variance between runs is to set the CPU affinity before running the benchmark. On Android this can be done using the taskset command. E.g. for running the benchmark on big cores on Pixel 2 with a single thread one can use the following command:
adb shell taskset f0 /data/local/tmp/benchmark_model \
--graph=/data/local/tmp/mobilenet_quant_v1_224.tflite \
--num_threads=1
where f0 is the affinity mask for big cores on Pixel 2. Note: The affinity mask varies with the device.
ref: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#reducing-variance-between-runs-on-android
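Because the mask varies with the device, one possible way to derive it is from each core's maximum frequency. The sketch below assumes that the big cores are exactly the cores reporting the highest maximum frequency, and that those frequencies have already been collected (on Linux-based devices they are usually exposed under /sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_max_freq). The function name and the sample frequencies are hypothetical, not taken from the TensorFlow Lite documentation. On devices with three clusters, this heuristic would select only the fastest cluster.

```python
def big_core_mask(max_freq_khz):
    """Return a hex affinity mask (without the 0x prefix) for the fastest cores.

    max_freq_khz maps CPU index -> maximum frequency in kHz.
    Assumption: the "big" cores are those with the highest maximum frequency.
    """
    top = max(max_freq_khz.values())
    mask = 0
    for cpu_idx, freq in max_freq_khz.items():
        if freq == top:
            mask |= 1 << cpu_idx  # set the bit for this CPU
    return format(mask, "x")


# Illustrative values only (not measured): four slower cores, four faster cores.
sample = {0: 1900800, 1: 1900800, 2: 1900800, 3: 1900800,
          4: 2457600, 5: 2457600, 6: 2457600, 7: 2457600}
print(big_core_mask(sample))  # prints "f0"
```

With this sample layout the result matches the f0 mask quoted above, since CPUs 4 through 7 map to bits 4 through 7.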
The CPU affinity is represented as a bitmask, with the lowest order bit corresponding to the first logical CPU and the highest order bit corresponding to the last logical CPU. Not all CPUs may exist on a given system but a mask may specify more CPUs than are present. A retrieved mask will reflect only the bits that correspond to CPUs physically on the system. If an invalid mask is given (i.e., one that corresponds to no valid CPUs on the current system) an error is returned. The masks are typically given in hexadecimal. For example,
0x00000001 is processor #0
0x00000003 is processors #0 and #1
0xFFFFFFFF is all processors (#0 through #31)
ref: https://linux.die.net/man/1/taskset
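To check what a given mask means, the bitmask can be decoded back into a list of logical CPU indices. This is a minimal sketch; the helper name mask_to_cpus is hypothetical and is not part of taskset.

```python
def mask_to_cpus(mask_hex):
    """Decode a hexadecimal affinity mask into a sorted list of CPU indices."""
    mask = int(mask_hex, 16)  # accepts "f0" as well as "0x000000f0"
    # Bit i set means logical CPU i is included in the mask.
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


print(mask_to_cpus("0x00000001"))  # [0]
print(mask_to_cpus("0x00000003"))  # [0, 1]
print(mask_to_cpus("f0"))          # [4, 5, 6, 7]
```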
Compute cpu_mask_str from cpu_idx_str
def cpu_idx_str_to_mask(cpu_idx_str_raw):
    # Parse a comma-separated list of CPU indices such as "0,1,2", skipping empty items.
    str_cpu_idx_list = [s for s in cpu_idx_str_raw.split(",") if s != ""]
    int_cpu_idx_list = map(int, str_cpu_idx_list)
    # Set one bit per CPU index; OR keeps the mask correct even if an index repeats.
    cpu_mask_10 = 0
    for cpu_idx in int_cpu_idx_list:
        cpu_mask_10 |= 1 << cpu_idx
    cpu_mask_hex_raw = hex(cpu_mask_10)
    # Zero-pad to at least 8 hex digits, as in the taskset man page examples.
    cpu_mask_hex_std = "0x{:08x}".format(cpu_mask_10)
    print("cpu_idx_str_raw:{}, cpu_mask_10:{}, cpu_mask_hex_raw:{}, cpu_mask_hex_std:{}".format(
        cpu_idx_str_raw, cpu_mask_10, cpu_mask_hex_raw, cpu_mask_hex_std))
    return cpu_mask_hex_std

print(cpu_idx_str_to_mask("0,1,2"))
print(cpu_idx_str_to_mask("0"))
print(cpu_idx_str_to_mask("1"))
print(cpu_idx_str_to_mask("0,1"))
print(cpu_idx_str_to_mask("1,3"))
print(cpu_idx_str_to_mask("1,5,7"))
Binding CPUs with taskset is not a universal solution: not every Android phone supports the taskset cpu_mask invocation.
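Since taskset may be missing or may reject the mask on some phones, a benchmark script could add the taskset prefix only when the caller knows it works on the target device. The sketch below only builds the adb argument list; the function name benchmark_cmd and its parameters are hypothetical, and how to detect taskset support is left to the caller.

```python
def benchmark_cmd(graph, num_threads, cpu_mask=None):
    """Build the adb argument list for benchmark_model.

    cpu_mask: hex affinity mask such as "f0", or None to run without taskset
    (for devices where taskset is unavailable or rejects the mask).
    """
    cmd = ["adb", "shell"]
    if cpu_mask is not None:
        cmd += ["taskset", cpu_mask]
    cmd += [
        "/data/local/tmp/benchmark_model",
        "--graph=" + graph,
        "--num_threads=" + str(num_threads),
    ]
    return cmd


graph = "/data/local/tmp/mobilenet_quant_v1_224.tflite"
print(" ".join(benchmark_cmd(graph, 1, cpu_mask="f0")))  # pinned to the masked cores
print(" ".join(benchmark_cmd(graph, 1)))                 # fallback without pinning
```

The returned list can be passed to subprocess.run on a host that has adb installed; without pinning, more variance between runs should be expected.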