Benchmark_model provided seems ineffective on gpu

Open ireneMsm2020 opened this issue 2 years ago • 3 comments

Hi, when I profile GPU latency on a Snapdragon 888+ with the TF 2.7 benchmark_model you provided, the reported latency always seems to be zero. Do you have any idea why?

Thank you in advance!

ireneMsm2020 Aug 18 '22 12:08

Hi, it seems something is going wrong during profiling. You could debug the benchmark model by running commands like these:

# push the model to device
adb [-s <device-serial>] push <path-of-your-model> <remote-model-path-to-push>

# run the benchmark model
adb [-s <device-serial>] shell <path-of-your-benchmark-model> --num_threads=1 --num_runs=50 --warmup_runs=10 --graph=<remote-model-path> --enable_op_profiling=true --use_gpu=false

If the benchmark model works correctly, the output will include the latency of each node and a summary message like this:

Timings (microseconds): count=222 first=3897 curr=3924 min=3858 max=4031 avg=3925.67 std=29
Memory (bytes): count=0
133 nodes observed
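
To check by hand whether a summary like the one above can be parsed, here is a minimal sketch in Python. The pattern `TIMINGS_RE` and the helper `parse_timings` are illustrative names made up for this example and are not nn-Meter's own parser; the sketch assumes the summary line has exactly the fields shown above.

```python
import re

# Assumed layout of the summary line printed by benchmark_model when
# --enable_op_profiling=true is set. This is a hypothetical pattern written
# for this example, not the one nn-Meter uses internally.
TIMINGS_RE = re.compile(
    r"Timings \(microseconds\): count=(?P<count>\d+) first=(?P<first>\d+) "
    r"curr=(?P<curr>\d+) min=(?P<min>\d+) max=(?P<max>\d+) "
    r"avg=(?P<avg>[\d.eE+-]+) std=(?P<std>\d+)"
)


def parse_timings(output: str) -> list:
    """Return one dict of floats per 'Timings' summary found in the output."""
    return [
        {key: float(value) for key, value in match.groupdict().items()}
        for match in TIMINGS_RE.finditer(output)
    ]


sample = """Timings (microseconds): count=222 first=3897 curr=3924 min=3858 max=4031 avg=3925.67 std=29
Memory (bytes): count=0
133 nodes observed"""

for summary in parse_timings(sample):
    # Values are in microseconds; convert the average to milliseconds.
    print(f"avg latency: {summary['avg'] / 1000:.3f} ms")
```

If `parse_timings` returns an empty list for your device's output, the output format differs from what the pattern expects, which is worth checking before suspecting the benchmark binary itself.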

JiahangXu Aug 19 '22 12:08

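I ran into the same problem. I also ran the test and got the following output: Timings (microseconds): count=100 first=913862 curr=914318 min=877044 max=926036 avg=911992 std=8223 Memory (bytes): count=0 1 nodes observed. However, the regular-expression matching step fails to match it. @JiahangXu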

lorena527 Jun 05 '23 07:06

I found the solution in dev/profile-in-local. Thanks!

lorena527 Jun 06 '23 03:06