Paddle
Print FLOPs and GFLOPs per second in the profiler report, for example:
-------------------------> Profiling Report <-------------------------
Place: All
Time unit: ms
Sorted by total time in descending order in the same thread
Event Calls Total Min. Max. Ave. Ratio. FLOPs. GFLOPsPerSec.
thread0::multiclass_nms 1 33.8452 33.8452 33.8452 33.8452 0.578611
thread0::conv2d_fusion 35 7.85149 0.03088 3.00003 0.224328 0.134228
thread0::conv2d_fusion/conv1/ 1 2.92042 2.92042 2.92042 2.92042 0.049927 447897600 153.367733
thread0::softmax 1 2.1207 2.1207 2.1207 2.1207 0.0362552
thread0::softmax/ 1 2.11693 2.11693 2.11693 2.11693 0.0361907
thread0::concat 5 1.17491 0.02384 0.89056 0.234982 0.0200861
thread0::concat/ 5 1.09261 0.020448 0.852576 0.218522 0.0186791
thread0::conv2d_fusion/conv1_add/ 1 0.843776 0.843776 0.843776 0.843776 0.0144251 1791590400 2123.301002
thread0::pool2d 5 0.664608 0.037344 0.360896 0.132922 0.011362
thread0::fusion_transpose_flatten_concat 2 0.640736 0.197152 0.443584 0.320368 0.0109539
thread0::pool2d/ 5 0.609696 0.033952 0.357056 0.121939 0.0104233
thread0::fusion_transpose_flatten_concat/ 2 0.586848 0.193536 0.393312 0.293424 0.0100327
thread0::conv2d_fusion/conv3_2/ 1 0.315008 0.315008 0.315008 0.315008 0.00538533 300810240 954.928847
thread0::conv2d_fusion/conv4_2/ 1 0.267936 0.267936 0.267936 0.267936 0.00458059 79626240 297.183815
thread0::density_prior_box 3 0.23472 0.053088 0.117056 0.07824 0.00401274
thread0::conv2d_fusion/inception_a1_3x3/ 1 0.23152 0.23152 0.23152 0.23152 0.00395803 28200960 121.807880
thread0::density_prior_box/ 3 0.21696 0.04592 0.113536 0.07232 0.00370911
thread0::conv2d_fusion/conv3_2_mbox_loc_context_face/ 1 0.190688 0.190688 0.190688 0.190688 0.00325997 9400320 49.296862
thread0::conv2d_fusion/conv2/ 1 0.186144 0.186144 0.186144 0.186144 0.00318229 338411520 1818.009338
thread0::conv2d_fusion/inception_a3_concat_mbox_loc_context_face/ 1 0.147968 0.147968 0.147968 0.147968 0.00252964 394813440 2668.235398
thread0::conv2d_fusion/inception_a1_3x3_2/ 1 0.126016 0.126016 0.126016 0.126016 0.00215435 28200960 223.788715
thread0::conv2d_fusion/inception_a1_3x3_2_reduce/ 1 0.11968 0.11968 0.11968 0.11968 0.00204603 6266880 52.363635
thread0::conv2d_fusion/conv4_2_mbox_loc_context_face/ 1 0.118752 0.118752 0.118752 0.118752 0.00203017 2488320 20.953920
thread0::conv2d_fusion/conv3_2_mbox_conf_context_face/ 1 0.113312 0.113312 0.113312 0.113312 0.00193716 4700160 41.479808
thread0::conv2d_fusion/conv3_1/ 1 0.103264 0.103264 0.103264 0.103264 0.00176538 66846720 647.338107
thread0::conv2d_fusion/inception_a1_3x3_3/ 1 0.102624 0.102624 0.102624 0.102624 0.00175444 37601280 366.398506
thread0::conv2d_fusion/inception_a2_3x3_2_reduce/ 1 0.09856 0.09856 0.09856 0.09856 0.00168497 12533760 127.168834
thread0::conv2d_fusion/conv4_2_mbox_conf_context_face/ 1 0.09424 0.09424 0.09424 0.09424 0.00161111 1244160 13.202037
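For readers checking the report, the two new columns appear to be related by a simple unit conversion: the FLOPs count divided by the elapsed time (reported in ms), scaled to 10^9 operations per second. The sketch below is not the profiler's code; `gflops_per_sec` is a hypothetical helper name, and the relation is inferred from the rows above rather than stated in the PR. Results differ from the report in the last few digits because the printed times are rounded.

```python
def gflops_per_sec(flops: int, total_ms: float) -> float:
    """Convert a FLOP count and an elapsed time in milliseconds to GFLOP/s.

    Hypothetical helper: the formula is inferred from the report rows,
    not taken from the profiler source.
    """
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    seconds = total_ms / 1e3          # report uses "Time unit: ms"
    return flops / seconds / 1e9      # 1 GFLOP = 1e9 floating-point operations


# (event name, FLOPs, total time in ms) copied from two rows of the report
rows = [
    ("conv2d_fusion/conv1/", 447897600, 2.92042),
    ("conv2d_fusion/conv1_add/", 1791590400, 0.843776),
]

for name, flops, total_ms in rows:
    print(f"{name}: {gflops_per_sec(flops, total_ms):.3f} GFLOPs/s")
```

In the report, events with more than one call (for example `thread0::conv2d_fusion` with 35 calls) show no FLOPs value, so the conversion applies only to the per-layer rows.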
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.
:white_check_mark: qingqing01
:white_check_mark: NHZlX
:x: Dang Qingqing
Dang Qingqing does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
We are sorry to inform you that, after repeated discussion, your PR does not yet meet the merging standard. Please read the Paddle native operator development specification (reference: Paddle Custom Operator Design Doc). We are closing this PR for now; you are welcome to submit a new one. Thank you for your contribution.