nn-Meter
The provided benchmark_model seems ineffective on GPU
Hi, when I profile GPU latency on a Snapdragon 888+ with the TF 2.7 benchmark_model you provided, the reported latency is always zero. Do you have any idea what might cause this?
Thank you in advance!
Hi, it seems something is going wrong during profiling. You could debug the benchmark model by running commands like these:
# push the model to device
adb [-s <device-serial>] push <path-of-your-model> <remote-model-path-to-push>
# run the benchmark model
adb [-s <device-serial>] shell <path-of-your-benchmark-model> --num_threads=1 --num_runs=50 --warmup_runs=10 --graph=<remote-model-path> --enable_op_profiling=true --use_gpu=false
If the benchmark model works well, the output will contain the latency of each node and a summary message like this:
Timings (microseconds): count=222 first=3897 curr=3924 min=3858 max=4031 avg=3925.67 std=29
Memory (bytes): count=0
133 nodes observed
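When checking the output by script, the summary line can be pulled out with a regular expression. The following is a minimal sketch, not nn-Meter's actual parser: the names `NUMBER`, `TIMINGS_RE` and `parse_timings` are hypothetical, and the pattern assumes the `Timings (microseconds): ...` layout shown above. It accepts each numeric field with or without a decimal part, since the tool may print a value such as `avg` either way.

```python
import re

# Hypothetical helper names; this is an illustration, not nn-Meter's parser.
# A number with an optional fractional part and optional exponent.
NUMBER = r"[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"

# Assumes the summary layout shown above:
# Timings (microseconds): count=... first=... curr=... min=... max=... avg=... std=...
TIMINGS_RE = re.compile(
    r"Timings \(microseconds\):\s*"
    r"count=(?P<count>\d+)\s+"
    rf"first=(?P<first>{NUMBER})\s+"
    rf"curr=(?P<curr>{NUMBER})\s+"
    rf"min=(?P<min>{NUMBER})\s+"
    rf"max=(?P<max>{NUMBER})\s+"
    rf"avg=(?P<avg>{NUMBER})\s+"
    rf"std=(?P<std>{NUMBER})"
)


def parse_timings(output: str) -> dict:
    """Return the fields of the first Timings summary line found in output."""
    match = TIMINGS_RE.search(output)
    if match is None:
        raise ValueError("no 'Timings (microseconds)' summary line found")
    # Convert every captured field to float, then restore count as an int.
    fields = {name: float(value) for name, value in match.groupdict().items()}
    fields["count"] = int(match.group("count"))
    return fields
```

Calling `parse_timings` on the captured stdout of the `adb shell` command returns a dictionary whose `avg` entry is the average latency in microseconds; a `ValueError` indicates that the summary line was missing or printed in a different layout.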
I ran into the same problem. I also ran a test and got the following output:
Timings (microseconds): count=100 first=913862 curr=914318 min=877044 max=926036 avg=911992 std=8223
Memory (bytes): count=0
1 nodes observed
However, the regular-expression match on this output fails. @JiahangXu
I found the solution on dev/profile-in-local, thanks.