Print FLOPs and GFLOPs per second in profiler.

Open qingqing01 opened this issue 6 years ago • 1 comments

The profiling report would look like this:

------------------------->     Profiling Report     <-------------------------

Place: All
Time unit: ms
Sorted by total time in descending order in the same thread

Event                                                                 Calls       Total       Min.        Max.        Ave.        Ratio.      FLOPs.      GFLOPsPerSec.
thread0::multiclass_nms                                               1           33.8452     33.8452     33.8452     33.8452     0.578611
thread0::conv2d_fusion                                                35          7.85149     0.03088     3.00003     0.224328    0.134228
thread0::conv2d_fusion/conv1/                                         1           2.92042     2.92042     2.92042     2.92042     0.049927    447897600   153.367733
thread0::softmax                                                      1           2.1207      2.1207      2.1207      2.1207      0.0362552
thread0::softmax/                                                     1           2.11693     2.11693     2.11693     2.11693     0.0361907
thread0::concat                                                       5           1.17491     0.02384     0.89056     0.234982    0.0200861
thread0::concat/                                                      5           1.09261     0.020448    0.852576    0.218522    0.0186791
thread0::conv2d_fusion/conv1_add/                                     1           0.843776    0.843776    0.843776    0.843776    0.0144251   1791590400  2123.301002
thread0::pool2d                                                       5           0.664608    0.037344    0.360896    0.132922    0.011362
thread0::fusion_transpose_flatten_concat                              2           0.640736    0.197152    0.443584    0.320368    0.0109539
thread0::pool2d/                                                      5           0.609696    0.033952    0.357056    0.121939    0.0104233
thread0::fusion_transpose_flatten_concat/                             2           0.586848    0.193536    0.393312    0.293424    0.0100327
thread0::conv2d_fusion/conv3_2/                                       1           0.315008    0.315008    0.315008    0.315008    0.00538533  300810240   954.928847
thread0::conv2d_fusion/conv4_2/                                       1           0.267936    0.267936    0.267936    0.267936    0.00458059  79626240    297.183815
thread0::density_prior_box                                            3           0.23472     0.053088    0.117056    0.07824     0.00401274
thread0::conv2d_fusion/inception_a1_3x3/                              1           0.23152     0.23152     0.23152     0.23152     0.00395803  28200960    121.807880
thread0::density_prior_box/                                           3           0.21696     0.04592     0.113536    0.07232     0.00370911
thread0::conv2d_fusion/conv3_2_mbox_loc_context_face/                 1           0.190688    0.190688    0.190688    0.190688    0.00325997  9400320     49.296862
thread0::conv2d_fusion/conv2/                                         1           0.186144    0.186144    0.186144    0.186144    0.00318229  338411520   1818.009338
thread0::conv2d_fusion/inception_a3_concat_mbox_loc_context_face/     1           0.147968    0.147968    0.147968    0.147968    0.00252964  394813440   2668.235398
thread0::conv2d_fusion/inception_a1_3x3_2/                            1           0.126016    0.126016    0.126016    0.126016    0.00215435  28200960    223.788715
thread0::conv2d_fusion/inception_a1_3x3_2_reduce/                     1           0.11968     0.11968     0.11968     0.11968     0.00204603  6266880     52.363635
thread0::conv2d_fusion/conv4_2_mbox_loc_context_face/                 1           0.118752    0.118752    0.118752    0.118752    0.00203017  2488320     20.953920
thread0::conv2d_fusion/conv3_2_mbox_conf_context_face/                1           0.113312    0.113312    0.113312    0.113312    0.00193716  4700160     41.479808
thread0::conv2d_fusion/conv3_1/                                       1           0.103264    0.103264    0.103264    0.103264    0.00176538  66846720    647.338107
thread0::conv2d_fusion/inception_a1_3x3_3/                            1           0.102624    0.102624    0.102624    0.102624    0.00175444  37601280    366.398506
thread0::conv2d_fusion/inception_a2_3x3_2_reduce/                     1           0.09856     0.09856     0.09856     0.09856     0.00168497  12533760    127.168834
thread0::conv2d_fusion/conv4_2_mbox_conf_context_face/                1           0.09424     0.09424     0.09424     0.09424     0.00161111  1244160     13.202037
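
The relation between the last two columns can be checked by hand. The sketch below is not the profiler's implementation; it assumes that GFLOPsPerSec. is the FLOPs. value divided by the event's time in seconds and then by 1e9, which reproduces the reported figures for the rows shown (each of which has Calls = 1). The function name `gflops_per_sec` is hypothetical.

```python
def gflops_per_sec(flops: int, time_ms: float) -> float:
    """Convert a FLOP count and an elapsed time in milliseconds to GFLOP/s.

    Assumption: this mirrors how the report derives GFLOPsPerSec. from
    FLOPs. and the time column; it is not taken from the profiler source.
    """
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    seconds = time_ms / 1e3          # the report's time unit is ms
    return flops / seconds / 1e9     # FLOP/s -> GFLOP/s


# (FLOPs, total time in ms) copied from three rows of the report above.
rows = {
    "conv2d_fusion/conv1/": (447897600, 2.92042),
    "conv2d_fusion/conv1_add/": (1791590400, 0.843776),
    "conv2d_fusion/conv3_2/": (300810240, 0.315008),
}

for name, (flops, total_ms) in rows.items():
    print(f"{name:<30}{flops:>12}{gflops_per_sec(flops, total_ms):>16.6f}")
```

For `conv2d_fusion/conv1/` this gives roughly 153.37 GFLOP/s, which agrees with the report up to rounding of the printed time. For events with Calls greater than 1, whether to divide by the total or the average time depends on whether the FLOPs figure is per call or summed; the report does not say.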

qingqing01, Dec 18 '18 09:12

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

:white_check_mark: qingqing01
:white_check_mark: NHZlX
:x: Dang Qingqing


Dang Qingqing does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

CLAassistant, Dec 28 '18 05:12

We are sorry to inform you that, after repeated discussion, your PR does not yet meet the merging standard. Please read the Paddle native operator development guidelines (飞桨原生算子开发规范; referred to in the original English notice as the Paddle Custom Operator Design Doc). We are closing this PR for now, and you are welcome to submit a new one. Thank you for your contribution.

paddle-bot[bot], Jan 11 '23 11:01