Implement NLLLoss
- Added NLLLoss forward and backward operations and kernels.
- Added driver test and gtest for NLLLoss.
- The new API is guarded by the MIOPEN_BETA_API macro.
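
For readers unfamiliar with the operation, the sketch below is a minimal pure-Python reference of what an unweighted NLLLoss forward and backward pass computes for a 2-D `[N, C]` input. It is not the MIOpen API or the kernel from this change: the function names are hypothetical, and the semantics (ignored targets contribute zero, `mean` divides by the number of non-ignored samples, upstream gradient of 1) are assumed to follow the common PyTorch-style definition suggested by the `ignore_index` and `reduction` columns in the tables.

```python
def nll_loss_forward(log_probs, targets, ignore_index=-100, reduction="none"):
    """Unweighted NLLLoss over a [N, C] input given as a list of rows."""
    losses = []
    counted = 0  # number of samples that are not ignored
    for row, target in zip(log_probs, targets):
        if target == ignore_index:
            losses.append(0.0)  # ignored samples contribute nothing
        else:
            losses.append(-row[target])
            counted += 1
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        # Average over non-ignored samples only; NaN if every sample is ignored.
        return sum(losses) / counted if counted else float("nan")
    raise ValueError(f"unknown reduction: {reduction}")


def nll_loss_backward(targets, num_classes, ignore_index=-100, reduction="none"):
    """Gradient w.r.t. the input, assuming an upstream gradient of 1."""
    counted = sum(1 for t in targets if t != ignore_index)
    scale = 1.0 / counted if reduction == "mean" and counted else 1.0
    grad = [[0.0] * num_classes for _ in targets]
    for n, target in enumerate(targets):
        if target != ignore_index:
            grad[n][target] = -scale  # only the target class receives gradient
    return grad


# Illustrative values only (not taken from the benchmarks).
example_log_probs = [[-0.5, -1.0], [-2.0, -0.25], [-0.125, -1.5]]
example_targets = [0, 1, -100]  # the last sample is ignored
example_loss = nll_loss_forward(example_log_probs, example_targets, reduction="mean")
example_grad = nll_loss_backward(example_targets, num_classes=2, reduction="mean")
```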

NLLLoss float16

| op_name | dtype | size | ignore_index | contiguous | reduction | direction | rocm_kernel_avg | miopen_kernel_time | speedup over ROCm (rocm_kernel_avg / miopen_kernel_time) |
|---|---|---|---|---|---|---|---|---|---|
| NLLLoss | float16 | [16_21_512_512] | 255 | contiguous | none | fwd | 890197 | 194222 | 4.583399409 |
| NLLLoss | float16 | [16_21_512_512] | 255 | contiguous | none | bwd | 1442306 | 359644 | 4.010371367 |
| NLLLoss | float16 | [16_21_512_512] | 255 | noncontiguous | none | fwd | 884532 | 500443 | 1.767497997 |
| NLLLoss | float16 | [16_21_512_512] | 255 | noncontiguous | none | bwd | 1442770 | 681580 | 2.116802136 |
| NLLLoss | float16 | [64_21_254_333] | 255 | contiguous | none | fwd | 1305054 | 225706 | 5.782097064 |
| NLLLoss | float16 | [64_21_254_333] | 255 | contiguous | none | bwd | 1819114 | 417190 | 4.360396941 |
| NLLLoss | float16 | [64_21_254_333] | 255 | noncontiguous | none | fwd | 1290158 | 377457 | 3.418026424 |
| NLLLoss | float16 | [64_21_254_333] | 255 | noncontiguous | none | bwd | 1828042 | 581190 | 3.145343175 |
| NLLLoss | float16 | [64_21_213_331] | 255 | contiguous | none | fwd | 1082360 | 184088 | 5.879579332 |
| NLLLoss | float16 | [64_21_213_331] | 255 | contiguous | none | bwd | 1517602 | 357759 | 4.241967358 |
| NLLLoss | float16 | [64_21_213_331] | 255 | noncontiguous | none | fwd | 1111929 | 277368 | 4.008858268 |
| NLLLoss | float16 | [64_21_213_331] | 255 | noncontiguous | none | bwd | 1510706 | 416906 | 3.623612997 |
| NLLLoss | float16 | [64_21_240_332] | 255 | contiguous | none | fwd | 1379231 | 209689 | 6.577507642 |
| NLLLoss | float16 | [64_21_240_332] | 255 | contiguous | none | bwd | 1801640 | 393102 | 4.583136183 |
| NLLLoss | float16 | [64_21_240_332] | 255 | noncontiguous | none | fwd | 1346462 | 350293 | 3.843816462 |
| NLLLoss | float16 | [64_21_240_332] | 255 | noncontiguous | none | bwd | 1808008 | 516301 | 3.501848728 |
| NLLLoss | float16 | [64_21_212_320] | 255 | contiguous | none | fwd | 1126393 | 192375 | 5.855194282 |
| NLLLoss | float16 | [64_21_212_320] | 255 | contiguous | none | bwd | 1501361 | 333056 | 4.507833517 |
| NLLLoss | float16 | [64_21_212_320] | 255 | noncontiguous | none | fwd | 1133625 | 264618 | 4.284005623 |
| NLLLoss | float16 | [64_21_212_320] | 255 | noncontiguous | none | bwd | 1504929 | 389283 | 3.865899615 |
| NLLLoss | float16 | [64_21_218_333] | 255 | contiguous | none | fwd | 1141113 | 190510 | 5.989780064 |
| NLLLoss | float16 | [64_21_218_333] | 255 | contiguous | none | bwd | 1594611 | 355903 | 4.480465183 |
| NLLLoss | float16 | [64_21_218_333] | 255 | noncontiguous | none | fwd | 1144889 | 297401 | 3.849647446 |
| NLLLoss | float16 | [64_21_218_333] | 255 | noncontiguous | none | bwd | 1594307 | 448163 | 3.557426651 |
| NLLLoss | float16 | [64_21_270_333] | 255 | contiguous | none | fwd | 1498160 | 235967 | 6.34902338 |
| NLLLoss | float16 | [64_21_270_333] | 255 | contiguous | none | bwd | 2033964 | 445394 | 4.566662326 |
| NLLLoss | float16 | [64_21_270_333] | 255 | noncontiguous | none | fwd | 1497568 | 427903 | 3.49978383 |
| NLLLoss | float16 | [64_21_270_333] | 255 | noncontiguous | none | bwd | 2029836 | 656137 | 3.09361612 |
| NLLLoss | float16 | [64_21_237_329] | 255 | contiguous | none | fwd | 1255186 | 204540 | 6.136628532 |
| NLLLoss | float16 | [64_21_237_329] | 255 | contiguous | none | bwd | 1724664 | 383268 | 4.499890416 |
| NLLLoss | float16 | [64_21_237_329] | 255 | noncontiguous | none | fwd | 1255423 | 341315 | 3.67819463 |
| NLLLoss | float16 | [64_21_237_329] | 255 | noncontiguous | none | bwd | 1727381 | 516541 | 3.344131444 |
| NLLLoss | float16 | [64_21_225_246] | 255 | contiguous | none | fwd | 833459 | 145753 | 5.718297394 |
| NLLLoss | float16 | [64_21_225_246] | 255 | contiguous | none | bwd | 1186100 | 269712 | 4.397653794 |
| NLLLoss | float16 | [64_21_225_246] | 255 | noncontiguous | none | fwd | 821154 | 191333 | 4.291753122 |
| NLLLoss | float16 | [64_21_225_246] | 255 | noncontiguous | none | bwd | 1185523 | 280467 | 4.226960748 |
| NLLLoss | float16 | [64_21_240_292] | 255 | contiguous | none | fwd | 1126461 | 185379 | 6.076529704 |
| NLLLoss | float16 | [64_21_240_292] | 255 | contiguous | none | bwd | 1543947 | 346670 | 4.453650446 |
| NLLLoss | float16 | [64_21_240_292] | 255 | noncontiguous | none | fwd | 1137307 | 281144 | 4.045282844 |
| NLLLoss | float16 | [64_21_240_292] | 255 | noncontiguous | none | bwd | 1543113 | 411521 | 3.749779477 |
| NLLLoss | float16 | [64_21_288_303] | 255 | contiguous | none | fwd | 1579586 | 229698 | 6.876794748 |
| NLLLoss | float16 | [64_21_288_303] | 255 | contiguous | none | bwd | 2035438 | 435930 | 4.66918542 |
| NLLLoss | float16 | [64_21_288_303] | 255 | noncontiguous | none | fwd | 1575681 | 379915 | 4.147456668 |
| NLLLoss | float16 | [64_21_288_303] | 255 | noncontiguous | none | bwd | 2045004 | 575659 | 3.552457271 |
| NLLLoss | float16 | [64_21_274_275] | 255 | contiguous | none | fwd | 1291295 | 197255 | 6.546323287 |
| NLLLoss | float16 | [64_21_274_275] | 255 | contiguous | none | bwd | 1718361 | 371632 | 4.623824106 |
| NLLLoss | float16 | [64_21_274_275] | 255 | noncontiguous | none | fwd | 1284766 | 291333 | 4.409956991 |
| NLLLoss | float16 | [64_21_274_275] | 255 | noncontiguous | none | bwd | 1716184 | 426564 | 4.02327435 |
| NLLLoss | float16 | [64_21_273_322] | 255 | contiguous | none | fwd | 1480952 | 232562 | 6.367987891 |
| NLLLoss | float16 | [64_21_273_322] | 255 | contiguous | none | bwd | 1981424 | 442885 | 4.473901803 |
| NLLLoss | float16 | [64_21_273_322] | 255 | noncontiguous | none | fwd | 1471384 | 413926 | 3.554703015 |
| NLLLoss | float16 | [64_21_273_322] | 255 | noncontiguous | none | bwd | 1996239 | 624818 | 3.194912759 |
| NLLLoss | float16 | [64_21_240_320] | 255 | contiguous | none | fwd | 1309257 | 206821 | 6.330387146 |
| NLLLoss | float16 | [64_21_240_320] | 255 | contiguous | none | bwd | 1718289 | 383173 | 4.484368679 |
| NLLLoss | float16 | [64_21_240_320] | 255 | noncontiguous | none | fwd | 1312616 | 333023 | 3.941517553 |
| NLLLoss | float16 | [64_21_240_320] | 255 | noncontiguous | none | bwd | 1715729 | 492006 | 3.487211538 |
| NLLLoss | float16 | [64_21_238_269] | 255 | contiguous | none | fwd | 1023852 | 165365 | 6.19146736 |
| NLLLoss | float16 | [64_21_238_269] | 255 | contiguous | none | bwd | 1409989 | 311922 | 4.520325594 |
| NLLLoss | float16 | [64_21_238_269] | 255 | noncontiguous | none | fwd | 1012924 | 246679 | 4.106243336 |
| NLLLoss | float16 | [64_21_238_269] | 255 | noncontiguous | none | bwd | 1402437 | 370783 | 3.782365966 |
| NLLLoss | float16 | [64_21_213_326] | 255 | contiguous | none | fwd | 1066155 | 182556 | 5.840153158 |
| NLLLoss | float16 | [64_21_213_326] | 255 | contiguous | none | bwd | 1502498 | 355761 | 4.223335329 |
| NLLLoss | float16 | [64_21_213_326] | 255 | noncontiguous | none | fwd | 1069627 | 266537 | 4.013052597 |
| NLLLoss | float16 | [64_21_213_326] | 255 | noncontiguous | none | bwd | 1506722 | 391210 | 3.851440403 |
| NLLLoss | float16 | [64_21_297_333] | 255 | contiguous | none | fwd | 1792364 | 267053 | 6.711641509 |
| NLLLoss | float16 | [64_21_297_333] | 255 | contiguous | none | bwd | 2313857 | 511777 | 4.521221157 |
| NLLLoss | float16 | [64_21_297_333] | 255 | noncontiguous | none | fwd | 1787788 | 499119 | 3.581887285 |
| NLLLoss | float16 | [64_21_297_333] | 255 | noncontiguous | none | bwd | 2314881 | 762438 | 3.036156383 |
| NLLLoss | float16 | [64_21_212_303] | 255 | contiguous | none | fwd | 1016907 | 168779 | 6.025080134 |
| NLLLoss | float16 | [64_21_212_303] | 255 | contiguous | none | bwd | 1383940 | 312332 | 4.4309901 |
| NLLLoss | float16 | [64_21_212_303] | 255 | noncontiguous | none | fwd | 1006076 | 242840 | 4.142958326 |
| NLLLoss | float16 | [64_21_212_303] | 255 | noncontiguous | none | bwd | 1383028 | 363976 | 3.799778007 |
| NLLLoss | float16 | [64_21_230_335] | 255 | contiguous | none | fwd | 1221639 | 202734 | 6.025822013 |
| NLLLoss | float16 | [64_21_230_335] | 255 | contiguous | none | bwd | 1686190 | 379371 | 4.444699252 |
| NLLLoss | float16 | [64_21_230_335] | 255 | noncontiguous | none | fwd | 1249142 | 315906 | 3.954157249 |
| NLLLoss | float16 | [64_21_230_335] | 255 | noncontiguous | none | bwd | 1684766 | 479459 | 3.513889613 |
| NLLLoss | float16 | [64_21_198_257] | 255 | contiguous | none | fwd | 703762 | 135767 | 5.183601317 |
| NLLLoss | float16 | [64_21_198_257] | 255 | contiguous | none | bwd | 1074762 | 248974 | 4.316763999 |
| NLLLoss | float16 | [64_21_198_257] | 255 | noncontiguous | none | fwd | 702306 | 165082 | 4.254285749 |
| NLLLoss | float16 | [64_21_198_257] | 255 | noncontiguous | none | bwd | 1080826 | 242378 | 4.459257853 |
| NLLLoss | float16 | [64_21_283_320] | 255 | contiguous | none | fwd | 1647087 | 239801 | 6.868557679 |
| NLLLoss | float16 | [64_21_283_320] | 255 | contiguous | none | bwd | 2113253 | 451584 | 4.679645426 |
| NLLLoss | float16 | [64_21_283_320] | 255 | noncontiguous | none | fwd | 1629423 | 434784 | 3.747660907 |
| NLLLoss | float16 | [64_21_283_320] | 255 | noncontiguous | none | bwd | 3439003 | 647705 | 5.309520538 |
| NLLLoss | float16 | [64_21_175_333] | 255 | contiguous | none | fwd | 812464 | 150825 | 5.386799271 |
| NLLLoss | float16 | [64_21_175_333] | 255 | contiguous | none | bwd | 1177497 | 293969 | 4.005514187 |
| NLLLoss | float16 | [64_21_175_333] | 255 | noncontiguous | none | fwd | 786705 | 211588 | 3.71809838 |
| NLLLoss | float16 | [64_21_175_333] | 255 | noncontiguous | none | bwd | 1180153 | 307035 | 3.843708372 |
| NLLLoss | float16 | [64_21_267_326] | 255 | contiguous | none | fwd | 1471443 | 233277 | 6.307707146 |
| NLLLoss | float16 | [64_21_267_326] | 255 | contiguous | none | bwd | 1980842 | 439140 | 4.510730063 |
| NLLLoss | float16 | [64_21_267_326] | 255 | noncontiguous | none | fwd | 1470628 | 404048 | 3.639735873 |
| NLLLoss | float16 | [64_21_267_326] | 255 | noncontiguous | none | bwd | 1980538 | 605503 | 3.270897089 |
| NLLLoss | float16 | [32_21_256_256] | 255 | contiguous | none | fwd | 377577 | 90399 | 4.176782929 |
| NLLLoss | float16 | [32_21_256_256] | 255 | contiguous | none | bwd | 715747 | 154416 | 4.635186768 |
| NLLLoss | float16 | [32_21_256_256] | 255 | noncontiguous | none | fwd | 375193 | 130788 | 2.868711197 |
| NLLLoss | float16 | [32_21_256_256] | 255 | noncontiguous | none | bwd | 720051 | 178166 | 4.041461334 |
| NLLLoss | float16 | [55_21_112_257] | 255 | contiguous | none | fwd | 241339 | 71466 | 3.376976464 |
| NLLLoss | float16 | [55_21_112_257] | 255 | contiguous | none | bwd | 455927 | 122843 | 3.711460971 |
| NLLLoss | float16 | [55_21_112_257] | 255 | noncontiguous | none | fwd | 247467 | 95128 | 2.601410731 |
| NLLLoss | float16 | [55_21_112_257] | 255 | noncontiguous | none | bwd | 456295 | 158238 | 2.883599388 |
| NLLLoss | float16 | [24_21_512_512] | 255 | contiguous | none | fwd | 1547971 | 294289 | 5.260036903 |
| NLLLoss | float16 | [24_21_512_512] | 255 | contiguous | none | bwd | 2458274 | 543548 | 4.522643814 |
| NLLLoss | float16 | [24_21_512_512] | 255 | noncontiguous | none | fwd | 1549924 | 692773 | 2.237275413 |
| NLLLoss | float16 | [24_21_512_512] | 255 | noncontiguous | none | bwd | 2461298 | 1170136 | 2.103429003 |
| NLLLoss | float16 | [16_21_512_512] | 255 | noncontiguous | sum | fwd | 515007 | 544227 | 0.946309169 |
| NLLLoss | float16 | [16_21_512_512] | 255 | noncontiguous | sum | bwd | 1275629 | 679031 | 1.878602008 |
| NLLLoss | float16 | [64_21_254_333] | 255 | noncontiguous | sum | fwd | 652399 | 438811 | 1.486742584 |
| NLLLoss | float16 | [64_21_254_333] | 255 | noncontiguous | sum | bwd | 776494 | 529771 | 1.465716319 |
| NLLLoss | float16 | [64_21_213_331] | 255 | noncontiguous | sum | fwd | 545551 | 331494 | 1.645734161 |
| NLLLoss | float16 | [64_21_213_331] | 255 | noncontiguous | sum | bwd | 643598 | 381899 | 1.685257097 |
| NLLLoss | float16 | [64_21_240_332] | 255 | noncontiguous | sum | fwd | 655326 | 404637 | 1.619540477 |
| NLLLoss | float16 | [64_21_240_332] | 255 | noncontiguous | sum | bwd | 710942 | 479328 | 1.483205655 |
| NLLLoss | float16 | [64_21_212_320] | 255 | noncontiguous | sum | fwd | 535919 | 316681 | 1.692299191 |
| NLLLoss | float16 | [64_21_212_320] | 255 | noncontiguous | sum | bwd | 588047 | 362688 | 1.621357751 |
| NLLLoss | float16 | [64_21_218_333] | 255 | noncontiguous | sum | fwd | 561151 | 352492 | 1.59195386 |
| NLLLoss | float16 | [64_21_218_333] | 255 | noncontiguous | sum | bwd | 659471 | 404015 | 1.632293355 |
| NLLLoss | float16 | [64_21_270_333] | 255 | noncontiguous | sum | fwd | 692127 | 490305 | 1.411625417 |
| NLLLoss | float16 | [64_21_270_333] | 255 | noncontiguous | sum | bwd | 884462 | 623080 | 1.419499904 |
| NLLLoss | float16 | [64_21_237_329] | 255 | noncontiguous | sum | fwd | 601838 | 398759 | 1.509277534 |
| NLLLoss | float16 | [64_21_237_329] | 255 | noncontiguous | sum | bwd | 708975 | 459503 | 1.542917021 |
| NLLLoss | float16 | [64_21_225_246] | 255 | noncontiguous | sum | fwd | 423295 | 235461 | 1.797728711 |
| NLLLoss | float16 | [64_21_225_246] | 255 | noncontiguous | sum | bwd | 532335 | 263282 | 2.021919463 |
| NLLLoss | float16 | [64_21_240_292] | 255 | noncontiguous | sum | fwd | 575871 | 333611 | 1.726175096 |
| NLLLoss | float16 | [64_21_240_292] | 255 | noncontiguous | sum | bwd | 623455 | 375868 | 1.658707312 |
| NLLLoss | float16 | [64_21_288_303] | 255 | noncontiguous | sum | fwd | 683230 | 441630 | 1.547064285 |
| NLLLoss | float16 | [64_21_288_303] | 255 | noncontiguous | sum | bwd | 777983 | 537060 | 1.44859606 |
| NLLLoss | float16 | [64_21_274_275] | 255 | noncontiguous | sum | fwd | 581791 | 348283 | 1.670454774 |
| NLLLoss | float16 | [64_21_274_275] | 255 | noncontiguous | sum | bwd | 687679 | 389313 | 1.766391053 |
| NLLLoss | float16 | [64_21_273_322] | 255 | noncontiguous | sum | fwd | 676431 | 474720 | 1.424905207 |
| NLLLoss | float16 | [64_21_273_322] | 255 | noncontiguous | sum | bwd | 859294 | 576320 | 1.491001527 |
| NLLLoss | float16 | [64_21_240_320] | 255 | noncontiguous | sum | fwd | 605343 | 390117 | 1.55169603 |
| NLLLoss | float16 | [64_21_240_320] | 255 | noncontiguous | sum | bwd | 672943 | 446651 | 1.506641651 |
| NLLLoss | float16 | [64_21_238_269] | 255 | noncontiguous | sum | fwd | 487567 | 296820 | 1.642635267 |
| NLLLoss | float16 | [64_21_238_269] | 255 | noncontiguous | sum | bwd | 604031 | 338829 | 1.782701599 |
| NLLLoss | float16 | [64_21_213_326] | 255 | noncontiguous | sum | fwd | 537567 | 319505 | 1.682499491 |
| NLLLoss | float16 | [64_21_213_326] | 255 | noncontiguous | sum | bwd | 639967 | 360644 | 1.774511707 |
| NLLLoss | float16 | [64_21_297_333] | 255 | noncontiguous | sum | fwd | 757167 | 564007 | 1.342478019 |
| NLLLoss | float16 | [64_21_297_333] | 255 | noncontiguous | sum | bwd | 970286 | 727191 | 1.334293191 |
| NLLLoss | float16 | [64_21_212_303] | 255 | noncontiguous | sum | fwd | 489663 | 295791 | 1.655435764 |
| NLLLoss | float16 | [64_21_212_303] | 255 | noncontiguous | sum | bwd | 629135 | 347543 | 1.810236431 |
| NLLLoss | float16 | [64_21_230_335] | 255 | noncontiguous | sum | fwd | 594815 | 370157 | 1.60692625 |
| NLLLoss | float16 | [64_21_230_335] | 255 | noncontiguous | sum | bwd | 699903 | 450798 | 1.552586746 |
| NLLLoss | float16 | [64_21_198_257] | 255 | noncontiguous | sum | fwd | 389359 | 209462 | 1.85885268 |
| NLLLoss | float16 | [64_21_198_257] | 255 | noncontiguous | sum | bwd | 490159 | 228324 | 2.146769503 |
| NLLLoss | float16 | [64_21_283_320] | 255 | noncontiguous | sum | fwd | 935598 | 496238 | 1.885381611 |
| NLLLoss | float16 | [64_21_283_320] | 255 | noncontiguous | sum | bwd | 879231 | 617742 | 1.423298076 |
| NLLLoss | float16 | [64_21_175_333] | 255 | noncontiguous | sum | fwd | 446271 | 259342 | 1.720781825 |
| NLLLoss | float16 | [64_21_175_333] | 255 | noncontiguous | sum | bwd | 562399 | 287820 | 1.953995553 |
| NLLLoss | float16 | [64_21_267_326] | 255 | noncontiguous | sum | fwd | 685007 | 463703 | 1.477253759 |
| NLLLoss | float16 | [64_21_267_326] | 255 | noncontiguous | sum | bwd | 789407 | 566435 | 1.39364093 |
| NLLLoss | float16 | [32_21_256_256] | 255 | noncontiguous | sum | fwd | 261600 | 161129 | 1.623543869 |
| NLLLoss | float16 | [32_21_256_256] | 255 | noncontiguous | sum | bwd | 1019871 | 163813 | 6.225824568 |
| NLLLoss | float16 | [55_21_112_257] | 255 | noncontiguous | sum | fwd | 195296 | 123460 | 1.581856472 |
| NLLLoss | float16 | [55_21_112_257] | 255 | noncontiguous | sum | bwd | 239472 | 145450 | 1.646421451 |
| NLLLoss | float16 | [24_21_512_512] | 255 | noncontiguous | sum | fwd | 770287 | 756619 | 1.018064574 |
| NLLLoss | float16 | [24_21_512_512] | 255 | noncontiguous | sum | bwd | 2092652 | 1133773 | 1.845741608 |
| NLLLoss | float16 | [16_21_512_512] | 255 | noncontiguous | mean | fwd | 515728 | 543328 | 0.949201955 |
| NLLLoss | float16 | [16_21_512_512] | 255 | noncontiguous | mean | bwd | 1272696 | 676817 | 1.88041376 |
| NLLLoss | float16 | [64_21_254_333] | 255 | noncontiguous | mean | fwd | 651435 | 438283 | 1.486334172 |
| NLLLoss | float16 | [64_21_254_333] | 255 | noncontiguous | mean | bwd | 780567 | 530441 | 1.471543489 |
| NLLLoss | float16 | [64_21_213_331] | 255 | noncontiguous | mean | fwd | 544879 | 332171 | 1.640356925 |
| NLLLoss | float16 | [64_21_213_331] | 255 | noncontiguous | mean | bwd | 643564 | 382268 | 1.68354139 |
| NLLLoss | float16 | [64_21_240_332] | 255 | noncontiguous | mean | fwd | 655675 | 404402 | 1.621344603 |
| NLLLoss | float16 | [64_21_240_332] | 255 | noncontiguous | mean | bwd | 708618 | 479849 | 1.476752062 |
| NLLLoss | float16 | [64_21_212_320] | 255 | noncontiguous | mean | fwd | 535471 | 317204 | 1.688096619 |
| NLLLoss | float16 | [64_21_212_320] | 255 | noncontiguous | mean | bwd | 591165 | 365008 | 1.619594639 |
| NLLLoss | float16 | [64_21_218_333] | 255 | noncontiguous | mean | fwd | 561086 | 352938 | 1.589757974 |
| NLLLoss | float16 | [64_21_218_333] | 255 | noncontiguous | mean | bwd | 662635 | 401044 | 1.652275062 |
| NLLLoss | float16 | [64_21_270_333] | 255 | noncontiguous | mean | fwd | 691773 | 489880 | 1.41212746 |
| NLLLoss | float16 | [64_21_270_333] | 255 | noncontiguous | mean | bwd | 880328 | 623176 | 1.41264747 |
| NLLLoss | float16 | [64_21_237_329] | 255 | noncontiguous | mean | fwd | 601410 | 398575 | 1.508900458 |
| NLLLoss | float16 | [64_21_237_329] | 255 | noncontiguous | mean | bwd | 708271 | 461223 | 1.535636774 |
| NLLLoss | float16 | [64_21_225_246] | 255 | noncontiguous | mean | fwd | 425783 | 236301 | 1.80186711 |
| NLLLoss | float16 | [64_21_225_246] | 255 | noncontiguous | mean | bwd | 533637 | 261118 | 2.043662252 |
| NLLLoss | float16 | [64_21_240_292] | 255 | noncontiguous | mean | fwd | 580692 | 333652 | 1.740412166 |
| NLLLoss | float16 | [64_21_240_292] | 255 | noncontiguous | mean | bwd | 640611 | 373954 | 1.713074335 |
| NLLLoss | float16 | [64_21_288_303] | 255 | noncontiguous | mean | fwd | 684179 | 442061 | 1.547702693 |
| NLLLoss | float16 | [64_21_288_303] | 255 | noncontiguous | mean | bwd | 801441 | 542967 | 1.47603998 |
| NLLLoss | float16 | [64_21_274_275] | 255 | noncontiguous | mean | fwd | 582661 | 352088 | 1.654873214 |
| NLLLoss | float16 | [64_21_274_275] | 255 | noncontiguous | mean | bwd | 690931 | 398400 | 1.734264558 |
| NLLLoss | float16 | [64_21_273_322] | 255 | noncontiguous | mean | fwd | 675460 | 473333 | 1.427029174 |
| NLLLoss | float16 | [64_21_273_322] | 255 | noncontiguous | mean | bwd | 858129 | 576213 | 1.489256577 |
| NLLLoss | float16 | [64_21_240_320] | 255 | noncontiguous | mean | fwd | 605142 | 388836 | 1.556291084 |
| NLLLoss | float16 | [64_21_240_320] | 255 | noncontiguous | mean | bwd | 672821 | 441387 | 1.524333521 |
| NLLLoss | float16 | [64_21_238_269] | 255 | noncontiguous | mean | fwd | 499512 | 296552 | 1.684399363 |
| NLLLoss | float16 | [64_21_238_269] | 255 | noncontiguous | mean | bwd | 602662 | 339112 | 1.777176862 |
| NLLLoss | float16 | [64_21_213_326] | 255 | noncontiguous | mean | fwd | 537767 | 319041 | 1.685573328 |
| NLLLoss | float16 | [64_21_213_326] | 255 | noncontiguous | mean | bwd | 638597 | 359059 | 1.778529434 |
| NLLLoss | float16 | [64_21_297_333] | 255 | noncontiguous | mean | fwd | 758500 | 565015 | 1.342442236 |
| NLLLoss | float16 | [64_21_297_333] | 255 | noncontiguous | mean | bwd | 968176 | 728660 | 1.32870749 |
| NLLLoss | float16 | [64_21_212_303] | 255 | noncontiguous | mean | fwd | 491768 | 293975 | 1.672822519 |
| NLLLoss | float16 | [64_21_212_303] | 255 | noncontiguous | mean | bwd | 614742 | 351486 | 1.748980045 |
| NLLLoss | float16 | [64_21_230_335] | 255 | noncontiguous | mean | fwd | 594615 | 370775 | 1.603708449 |
| NLLLoss | float16 | [64_21_230_335] | 255 | noncontiguous | mean | bwd | 700693 | 454598 | 1.54134642 |
| NLLLoss | float16 | [64_21_198_257] | 255 | noncontiguous | mean | fwd | 392634 | 209068 | 1.878020548 |
| NLLLoss | float16 | [64_21_198_257] | 255 | noncontiguous | mean | bwd | 490488 | 227112 | 2.159674522 |
| NLLLoss | float16 | [64_21_283_320] | 255 | noncontiguous | mean | fwd | 740148 | 496466 | 1.490833209 |
| NLLLoss | float16 | [64_21_283_320] | 255 | noncontiguous | mean | bwd | 869042 | 613942 | 1.415511563 |
| NLLLoss | float16 | [64_21_175_333] | 255 | noncontiguous | mean | fwd | 446313 | 257797 | 1.73125754 |
| NLLLoss | float16 | [64_21_175_333] | 255 | noncontiguous | mean | bwd | 562855 | 290846 | 1.935233766 |
| NLLLoss | float16 | [64_21_267_326] | 255 | noncontiguous | mean | fwd | 687894 | 461888 | 1.489309097 |
| NLLLoss | float16 | [64_21_267_326] | 255 | noncontiguous | mean | bwd | 789715 | 558493 | 1.414010561 |
| NLLLoss | float16 | [32_21_256_256] | 255 | noncontiguous | mean | fwd | 259788 | 161708 | 1.606525342 |
| NLLLoss | float16 | [32_21_256_256] | 255 | noncontiguous | mean | bwd | 1026048 | 163824 | 6.263111632 |
| NLLLoss | float16 | [55_21_112_257] | 255 | noncontiguous | mean | fwd | 195085 | 123112 | 1.58461401 |
| NLLLoss | float16 | [55_21_112_257] | 255 | noncontiguous | mean | bwd | 238188 | 145904 | 1.632498081 |
| NLLLoss | float16 | [24_21_512_512] | 255 | noncontiguous | mean | fwd | 767252 | 755740 | 1.015232752 |
| NLLLoss | float16 | [24_21_512_512] | 255 | noncontiguous | mean | bwd | 2069663 | 1137787 | 1.819025002 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | contiguous | none | fwd | 95492 | 90452 | 1.055720161 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | contiguous | none | bwd | 164775 | 93385 | 1.764469669 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | noncontiguous | none | fwd | 172297 | 250431 | 0.688001885 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | noncontiguous | none | bwd | 165708 | 233578 | 0.709433251 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | contiguous | mean | fwd | 163943 | 136851 | 1.197967132 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | contiguous | mean | bwd | 198872 | 86416 | 2.301333086 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | noncontiguous | mean | fwd | 245028 | 312758 | 0.78344279 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | noncontiguous | mean | bwd | 199280 | 229258 | 0.869239023 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | contiguous | sum | fwd | 160022 | 137295 | 1.165534069 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | contiguous | sum | bwd | 197592 | 86505 | 2.284168545 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | noncontiguous | sum | fwd | 239616 | 319514 | 0.74993897 |
| NLLLoss | float16 | 2 3 128 128 128 | -100 | noncontiguous | sum | bwd | 196415 | 226680 | 0.866485795 |
| NLLLoss | float16 | 256 81 8732 | -100 | contiguous | none | fwd | 396607 | 167588 | 2.366559658 |
| NLLLoss | float16 | 256 81 8732 | -100 | contiguous | none | bwd | 891761 | 376900 | 2.36604139 |
| NLLLoss | float16 | 256 81 8732 | -100 | noncontiguous | none | fwd | 1414219 | 91216 | 15.50406727 |
| NLLLoss | float16 | 256 81 8732 | -100 | noncontiguous | none | bwd | 892265 | 122469 | 7.285639631 |
| NLLLoss | float16 | 256 81 8732 | -100 | contiguous | mean | fwd | 175014 | 202556 | 0.864027726 |
| NLLLoss | float16 | 256 81 8732 | -100 | contiguous | mean | bwd | 770860 | 374660 | 2.057492126 |
| NLLLoss | float16 | 256 81 8732 | -100 | noncontiguous | mean | fwd | 1199184 | 127517 | 9.404110824 |
| NLLLoss | float16 | 256 81 8732 | -100 | noncontiguous | mean | bwd | 769962 | 108603 | 7.089693655 |
| NLLLoss | float16 | 256 81 8732 | -100 | contiguous | sum | fwd | 169237 | 201632 | 0.839336018 |
| NLLLoss | float16 | 256 81 8732 | -100 | contiguous | sum | bwd | 769451 | 377789 | 2.036721556 |
| NLLLoss | float16 | 256 81 8732 | -100 | noncontiguous | sum | fwd | 1194572 | 127623 | 9.360162353 |
| NLLLoss | float16 | 256 81 8732 | -100 | noncontiguous | sum | bwd | 767677 | 109154 | 7.032971765 |
| NLLLoss | float16 | 256 100 | -100 | contiguous | none | fwd | 10161 | 9298 | 1.092815659 |
| NLLLoss | float16 | 256 100 | -100 | contiguous | none | bwd | 14592 | 8000 | 1.824 |
| NLLLoss | float16 | 256 100 | -100 | noncontiguous | none | fwd | 16873 | 9209 | 1.832229341 |
| NLLLoss | float16 | 256 100 | -100 | noncontiguous | none | bwd | 20219 | 8035 | 2.516365899 |
| NLLLoss | float16 | 256 100 | -100 | contiguous | mean | fwd | 15808 | 15556 | 1.016199537 |
| NLLLoss | float16 | 256 100 | -100 | contiguous | mean | bwd | 14352 | 7058 | 2.033437234 |
| NLLLoss | float16 | 256 100 | -100 | noncontiguous | mean | fwd | 19128 | 15893 | 1.203548732 |
| NLLLoss | float16 | 256 100 | -100 | noncontiguous | mean | bwd | 21062 | 6844 | 3.077440094 |
| NLLLoss | float16 | 256 100 | -100 | contiguous | sum | fwd | 15568 | 15697 | 0.991781869 |
| NLLLoss | float16 | 256 100 | -100 | contiguous | sum | bwd | 12752 | 7058 | 1.80674412 |
| NLLLoss | float16 | 256 100 | -100 | noncontiguous | sum | fwd | 18488 | 16249 | 1.137793095 |
| NLLLoss | float16 | 256 100 | -100 | noncontiguous | sum | bwd | 20350 | 7058 | 2.883253046 |
| NLLLoss | float16 | 40 2 | -100 | contiguous | none | fwd | 8688 | 7449 | 1.166331051 |
| NLLLoss | float16 | 40 2 | -100 | contiguous | none | bwd | 14720 | 7253 | 2.029505032 |
| NLLLoss | float16 | 40 2 | -100 | noncontiguous | none | fwd | 11782 | 7396 | 1.593023256 |
| NLLLoss | float16 | 40 2 | -100 | noncontiguous | none | bwd | 17018 | 7289 | 2.334750995 |
| NLLLoss | float16 | 40 2 | -100 | contiguous | mean | fwd | 8256 | 13511 | 0.611057657 |
| NLLLoss | float16 | 40 2 | -100 | contiguous | mean | bwd | 8880 | 6578 | 1.349954393 |
| NLLLoss | float16 | 40 2 | -100 | noncontiguous | mean | fwd | 12102 | 14329 | 0.84458092 |
| NLLLoss | float16 | 40 2 | -100 | noncontiguous | mean | bwd | 14705 | 6400 | 2.29765625 |
| NLLLoss | float16 | 40 2 | -100 | contiguous | sum | fwd | 7776 | 14187 | 0.548107422 |
| NLLLoss | float16 | 40 2 | -100 | contiguous | sum | bwd | 7728 | 6382 | 1.210905672 |
| NLLLoss | float16 | 40 2 | -100 | noncontiguous | sum | fwd | 11985 | 13689 | 0.875520491 |
| NLLLoss | float16 | 40 2 | -100 | noncontiguous | sum | bwd | 13323 | 6418 | 2.075880337 |
| NLLLoss | float16 | 8192 52100 | -100 | contiguous | none | fwd | 28769 | 14437 | 1.992727021 |
| NLLLoss | float16 | 8192 52100 | -100 | contiguous | none | bwd | 784551 | 14260 | 55.01760168 |
| NLLLoss | float16 | 8192 52100 | -100 | noncontiguous | none | fwd | 2435181 | 14579 | 167.0334728 |
| NLLLoss | float16 | 8192 52100 | -100 | noncontiguous | none | bwd | 3194068 | 14455 | 220.9663092 |
| NLLLoss | float16 | 8192 52100 | -100 | contiguous | mean | fwd | 407147 | 24749 | 16.45104853 |
| NLLLoss | float16 | 8192 52100 | -100 | contiguous | mean | bwd | 1029516 | 12855 | 80.08681447 |
| NLLLoss | float16 | 8192 52100 | -100 | noncontiguous | mean | fwd | 2759594 | 24642 | 111.9874199 |
| NLLLoss | float16 | 8192 52100 | -100 | noncontiguous | mean | bwd | 3429560 | 12783 | 268.2906986 |
| NLLLoss | float16 | 8192 52100 | -100 | contiguous | sum | fwd | 2491153 | 24944 | 99.86982842 |
| NLLLoss | float16 | 8192 52100 | -100 | contiguous | sum | bwd | 1029995 | 12836 | 80.24267685 |
| NLLLoss | float16 | 8192 52100 | -100 | noncontiguous | sum | fwd | 2760448 | 25104 | 109.9604844 |
| NLLLoss | float16 | 8192 52100 | -100 | noncontiguous | sum | bwd | 3429032 | 12961 | 264.5653885 |
| NLLLoss | float16 | 20480 50000 | -100 | contiguous | none | fwd | 35521 | 20285 | 1.75109687 |
| NLLLoss | float16 | 20480 50000 | -100 | contiguous | none | bwd | 1852737 | 21547 | 85.9858449 |
| NLLLoss | float16 | 20480 50000 | -100 | noncontiguous | none | fwd | 5804885 | 19752 | 293.888467 |
| NLLLoss | float16 | 20480 50000 | -100 | noncontiguous | none | bwd | 7600988 | 21049 | 361.1092213 |
| NLLLoss | float16 | 20480 50000 | -100 | contiguous | mean | fwd | 989040 | 22027 | 44.90125755 |
| NLLLoss | float16 | 20480 50000 | -100 | contiguous | mean | bwd | 2420295 | 18898 | 128.071489 |
| NLLLoss | float16 | 20480 50000 | -100 | noncontiguous | mean | fwd | 6632306 | 28800 | 230.2884028 |
| NLLLoss | float16 | 20480 50000 | -100 | noncontiguous | mean | bwd | 8166788 | 19449 | 419.9078616 |
| NLLLoss | float16 | 20480 50000 | -100 | contiguous | sum | fwd | 987278 | 28125 | 35.10321778 |
| NLLLoss | float16 | 20480 50000 | -100 | contiguous | sum | bwd | 2418932 | 19289 | 125.4047385 |
| NLLLoss | float16 | 20480 50000 | -100 | noncontiguous | sum | fwd | 6631614 | 27697 | 239.4343792 |
| NLLLoss | float16 | 20480 50000 | -100 | noncontiguous | sum | bwd | 8173236 | 19164 | 426.489042 |
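
The last column of the table is consistent with dividing `rocm_kernel_avg` by `miopen_kernel_time`, so values above 1 mean the MIOpen kernel was faster and values below 1 mean it was slower. The short sketch below reproduces that ratio for two rows copied from the table; the helper name `speedup` is hypothetical, and the timing units are not stated in the source (the ratio itself is unitless).

```python
def speedup(rocm_kernel_avg, miopen_kernel_time):
    """Ratio of the baseline ROCm time to the MIOpen time (unitless)."""
    return rocm_kernel_avg / miopen_kernel_time


# (label, rocm_kernel_avg, miopen_kernel_time) copied from two table rows.
rows = [
    ("[16_21_512_512] contiguous none fwd", 890197, 194222),
    ("[16_21_512_512] noncontiguous sum fwd", 515007, 544227),
]

for label, rocm, miopen in rows:
    ratio = speedup(rocm, miopen)
    verdict = "faster" if ratio > 1.0 else "slower"
    print(f"{label}: {ratio:.9f} (MIOpen {verdict} than the ROCm baseline)")
```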

NLLLoss float32

| op_name | dtype | size | ignore_index | contiguous | reduction | direction | rocm_kernel_avg | miopen_kernel_time | speedup over ROCm (rocm_kernel_avg / miopen_kernel_time) |
|---|---|---|---|---|---|---|---|---|---|
| NLLLoss | float32 | [16_21_512_512] | 255 | contiguous | none | fwd | 998583 | 272498 | 3.664551666 |
| NLLLoss | float32 | [16_21_512_512] | 255 | contiguous | none | bwd | 1543684 | 600994 | 2.568551433 |
| NLLLoss | float32 | [16_21_512_512] | 255 | noncontiguous | none | fwd | 996327 | 564923 | 1.763650975 |
| NLLLoss | float32 | [16_21_512_512] | 255 | noncontiguous | none | bwd | 1548868 | 828425 | 1.869653861 |
| NLLLoss | float32 | [64_21_254_333] | 255 | contiguous | none | fwd | 1461617 | 325279 | 4.493425644 |
| NLLLoss | float32 | [64_21_254_333] | 255 | contiguous | none | bwd | 2081951 | 691963 | 3.008760584 |
| NLLLoss | float32 | [64_21_254_333] | 255 | noncontiguous | none | fwd | 1456849 | 484248 | 3.008477061 |
| NLLLoss | float32 | [64_21_254_333] | 255 | noncontiguous | none | bwd | 2090832 | 781438 | 2.675621099 |
| NLLLoss | float32 | [64_21_213_331] | 255 | contiguous | none | fwd | 1228539 | 273670 | 4.489125589 |
| NLLLoss | float32 | [64_21_213_331] | 255 | contiguous | none | bwd | 1672966 | 579110 | 2.888857039 |
| NLLLoss | float32 | [64_21_213_331] | 255 | noncontiguous | none | fwd | 1224780 | 359324 | 3.408567198 |
| NLLLoss | float32 | [64_21_213_331] | 255 | noncontiguous | none | bwd | 1676741 | 594328 | 2.821238441 |
| NLLLoss | float32 | [64_21_240_332] | 255 | contiguous | none | fwd | 1517298 | 296835 | 5.111587245 |
| NLLLoss | float32 | [64_21_240_332] | 255 | contiguous | none | bwd | 1995340 | 645776 | 3.089833007 |
| NLLLoss | float32 | [64_21_240_332] | 255 | noncontiguous | none | fwd | 1514402 | 445030 | 3.402921151 |
| NLLLoss | float32 | [64_21_240_332] | 255 | noncontiguous | none | bwd | 1978236 | 718683 | 2.752584937 |
| NLLLoss | float32 | [64_21_212_320] | 255 | contiguous | none | fwd | 1293980 | 278786 | 4.64148128 |
| NLLLoss | float32 | [64_21_212_320] | 255 | contiguous | none | bwd | 1649668 | 552187 | 2.98751691 |
| NLLLoss | float32 | [64_21_212_320] | 255 | noncontiguous | none | fwd | 1298509 | 343066 | 3.785012213 |
| NLLLoss | float32 | [64_21_212_320] | 255 | noncontiguous | none | bwd | 1651108 | 563742 | 2.928836241 |
| NLLLoss | float32 | [64_21_218_333] | 255 | contiguous | none | fwd | 1262620 | 280941 | 4.494253242 |
| NLLLoss | float32 | [64_21_218_333] | 255 | contiguous | none | bwd | 1767446 | 592581 | 2.982623473 |
| NLLLoss | float32 | [64_21_218_333] | 255 | noncontiguous | none | fwd | 1261515 | 385982 | 3.268325984 |
| NLLLoss | float32 | [64_21_218_333] | 255 | noncontiguous | none | bwd | 1758406 | 629699 | 2.792454808 |
| NLLLoss | float32 | [64_21_270_333] | 255 | contiguous | none | fwd | 1583442 | 346628 | 4.568130676 |
| NLLLoss | float32 | [64_21_270_333] | 255 | contiguous | none | bwd | 2216912 | 736685 | 3.009307913 |
| NLLLoss | float32 | [64_21_270_333] | 255 | noncontiguous | none | fwd | 1587010 | 554136 | 2.863935929 |
| NLLLoss | float32 | [64_21_270_333] | 255 | noncontiguous | none | bwd | 2213200 | 883575 | 2.504824152 |
| NLLLoss | float32 | [64_21_237_329] | 255 | contiguous | none | fwd | 1382030 | 300731 | 4.595568797 |
| NLLLoss | float32 | [64_21_237_329] | 255 | contiguous | none | bwd | 1885315 | 646243 | 2.91734688 |
| NLLLoss | float32 | [64_21_237_329] | 255 | noncontiguous | none | fwd | 1385036 | 442253 | 3.13177299 |
| NLLLoss | float32 | [64_21_237_329] | 255 | noncontiguous | none | bwd | 1881984 | 713173 | 2.63888846 |
| NLLLoss | float32 | [64_21_225_246] | 255 | contiguous | none | fwd | 925105 | 213555 | 4.331928543 |
| NLLLoss | float32 | [64_21_225_246] | 255 | contiguous | none | bwd | 1300289 | 454948 | 2.858104663 |
| NLLLoss | float32 | [64_21_225_246] | 255 | noncontiguous | none | fwd | 915312 | 257375 | 3.556336085 |
| NLLLoss | float32 | [64_21_225_246] | 255 | noncontiguous | none | bwd | 1282592 | 433385 | 2.959474832 |
| NLLLoss | float32 | [64_21_240_292] | 255 | contiguous | none | fwd | 1242777 | 259473 | 4.789619729 |
| NLLLoss | float32 | [64_21_240_292] | 255 | contiguous | none | bwd | 1687463 | 557738 | 3.025547838 |
| NLLLoss | float32 | [64_21_240_292] | 255 | noncontiguous | none | fwd | 1248265 | 356110 | 3.505279268 |
| NLLLoss | float32 | [64_21_240_292] | 255 | noncontiguous | none | bwd | 1684518 | 583959 | 2.884651148 |
| NLLLoss | float32 | [64_21_288_303] | 255 | contiguous | none | fwd | 1673182 | 337997 | 4.950286541 |
| NLLLoss | float32 | [64_21_288_303] | 255 | contiguous | none | bwd | 2164570 | 734819 | 2.945718606 |
| NLLLoss | float32 | [64_21_288_303] | 255 | noncontiguous | none | fwd | 1703533 | 490223 | 3.475016472 |
| NLLLoss | float32 | [64_21_288_303] | 255 | noncontiguous | none | bwd | 2186519 | 780062 | 2.803006684 |
| NLLLoss | float32 | [64_21_274_275] | 255 | contiguous | none | fwd | 1406124 | 290160 | 4.846029777 |
| NLLLoss | float32 | [64_21_274_275] | 255 | contiguous | none | bwd | 1921637 | 618558 | 3.106639959 |
| NLLLoss | float32 | [64_21_274_275] | 255 | noncontiguous | none | fwd | 1420923 | 377215 | 3.766878305 |
| NLLLoss | float32 | [64_21_274_275] | 255 | noncontiguous | none | bwd | 1913316 | 614541 | 3.113406591 |
| NLLLoss | float32 | [64_21_273_322] | 255 | contiguous | none | fwd | 1597557 | 338675 | 4.717079796 |
| NLLLoss | float32 | [64_21_273_322] | 255 | contiguous | none | bwd | 2167996 | 719625 | 3.012674657 |
| NLLLoss | float32 | [64_21_273_322] | 255 | noncontiguous | none | fwd | 1596805 | 527417 | 3.027594863 |
| NLLLoss | float32 | [64_21_273_322] | 255 | noncontiguous | none | bwd | 2169547 | 840493 | 2.581279083 |
| NLLLoss | float32 | [64_21_240_320] | 255 | contiguous | none | fwd | 1429974 | 286766 | 4.986553497 |
| NLLLoss | float32 | [64_21_240_320] | 255 | contiguous | none | bwd | 1861806 | 624145 | 2.982970303 |
| NLLLoss | float32 | [64_21_240_320] | 255 | noncontiguous | none | fwd | 2241110 | 423776 | 5.28843068 |
| NLLLoss | float32 | [64_21_240_320] | 255 | noncontiguous | none | bwd | 1861757 | 669370 | 2.781357097 |
| NLLLoss | float32 | [64_21_238_269] | 255 | contiguous | none | fwd | 1090139 | 246092 | 4.429802675 |
| NLLLoss | float32 | [64_21_238_269] | 255 | contiguous | none | bwd | 1504515 | 521375 | 2.885667706 |
| NLLLoss | float32 | [64_21_238_269] | 255 | noncontiguous | none | fwd | 1083899 | 324544 | 3.339759786 |
| NLLLoss | float32 | [64_21_238_269] | 255 | noncontiguous | none | bwd | 1508515 | 543259 | 2.776787867 |
| NLLLoss | float32 | [64_21_213_326] | 255 | contiguous | none | fwd | 1202984 | 267497 | 4.497186884 |
| NLLLoss | float32 | [64_21_213_326] | 255 | contiguous | none | bwd | 1647695 | 574690 | 2.867102264 |
| NLLLoss | float32 | [64_21_213_326] | 255 | noncontiguous | none | fwd | 1206360 | 345397 | 3.492676543 |
| NLLLoss | float32 | [64_21_213_326] | 255 | noncontiguous | none | bwd | 1647327 | 575650 | 2.861681577 |
| NLLLoss | float32 | [64_21_297_333] | 255 | contiguous | none | fwd | 1845434 | 387975 | 4.756579677 |
| NLLLoss | float32 | [64_21_297_333] | 255 | contiguous | none | bwd | 2477646 | 827113 | 2.995535072 |
| NLLLoss | float32 | [64_21_297_333] | 255 | noncontiguous | none | fwd | 1878650 | 626921 | 2.996629559 |
| NLLLoss | float32 | [64_21_297_333] | 255 | noncontiguous | none | bwd | 2474302 | 972888 | 2.543254722 |
| NLLLoss | float32 | [64_21_212_303] | 255 | contiguous | none | fwd | 1102377 | 256421 | 4.299090168 |
| NLLLoss | float32 | [64_21_212_303] | 255 | contiguous | none | bwd | 1517361 | 540737 | 2.806097974 |
| NLLLoss | float32 | [64_21_212_303] | 255 | noncontiguous | none | fwd | 1094537 | 320012 | 3.420299864 |
| NLLLoss | float32 | [64_21_212_303] | 255 | noncontiguous | none | bwd | 1517553 | 537555 | 2.823065547 |
| NLLLoss | float32 | [64_21_230_335] | 255 | contiguous | none | fwd | 1409203 | 297666 | 4.734175217 |
| NLLLoss | float32 | [64_21_230_335] | 255 | contiguous | none | bwd | 1936152 | 632594 | 3.060655017 |
| NLLLoss | float32 | [64_21_230_335] | 255 | noncontiguous | none | fwd | 1402580 | 410607 | 3.415869676 |
| NLLLoss | float32 | [64_21_230_335] | 255 | noncontiguous | none | bwd | 1916921 | 676291 | 2.834461792 |
| NLLLoss | float32 | [64_21_198_257] | 255 | contiguous | none | fwd | 806544 | 200281 | 4.027061978 |
| NLLLoss | float32 | [64_21_198_257] | 255 | contiguous | none | bwd | 1187064 | 412580 | 2.877172912 |
| NLLLoss | float32 | [64_21_198_257] | 255 | noncontiguous | none | fwd | 800016 | 224583 | 3.562228664 |
| NLLLoss | float32 | [64_21_198_257] | 255 | noncontiguous | none | bwd | 1182856 | 385398 | 3.069180432 |
| NLLLoss | float32 | [64_21_283_320] | 255 | contiguous | none | fwd | 1762525 | 336244 | 5.241803571 |
| NLLLoss | float32 | [64_21_283_320] | 255 | contiguous | none | bwd | 2278179 | 734478 | 3.101766152 |
| NLLLoss | float32 | [64_21_283_320] | 255 | noncontiguous | none | fwd | 1757693 | 539530 | 3.257822549 |
| NLLLoss | float32 | [64_21_283_320] | 255 | noncontiguous | none | bwd | 2284946 | 851596 | 2.683133786 |
| NLLLoss | float32 | [64_21_175_333] | 255 | contiguous | none | fwd | 875855 | 226806 | 3.861692371 |
| NLLLoss | float32 | [64_21_175_333] | 255 | contiguous | none | bwd | 1291607 | 476313 | 2.711676986 |
| NLLLoss | float32 | [64_21_175_333] | 255 | noncontiguous | none | fwd | 862255 | 276956 | 3.113328471 |
| NLLLoss | float32 | [64_21_175_333] | 255 | noncontiguous | none | bwd | 1290454 | 465593 | 2.771635312 |
| NLLLoss | float32 | [64_21_267_326] | 255 | contiguous | none | fwd | 1648992 | 349257 | 4.721428633 |
| NLLLoss | float32 | [64_21_267_326] | 255 | contiguous | none | bwd | 2203893 | 731279 | 3.01375125 |
| NLLLoss | float32 | [64_21_267_326] | 255 | noncontiguous | none | fwd | 1646576 | 506660 | 3.249863814 |
| NLLLoss | float32 | [64_21_267_326] | 255 | noncontiguous | none | bwd | 2221669 | 811331 | 2.73830163 |
| NLLLoss | float32 | [32_21_256_256] | 255 | contiguous | none | fwd | 471623 | 126877 | 3.71716702 |
| NLLLoss | float32 | [32_21_256_256] | 255 | contiguous | none | bwd | 848928 | 264547 | 3.208987439 |
| NLLLoss | float32 | [32_21_256_256] | 255 | noncontiguous | none | fwd | 468679 | 160371 | 2.922467279 |
| NLLLoss | float32 | [32_21_256_256] | 255 | noncontiguous | none | bwd | 848208 | 262930 | 3.225984102 |
| NLLLoss | float32 | [55_21_112_257] | 255 | contiguous | none | fwd | 267451 | 110505 | 2.420261527 |
| NLLLoss | float32 | [55_21_112_257] | 255 | contiguous | none | bwd | 517974 | 207695 | 2.49391656 |
| NLLLoss | float32 | [55_21_112_257] | 255 | noncontiguous | none | fwd | 279291 | 138131 | 2.021928459 |
| NLLLoss | float32 | [55_21_112_257] | 255 | noncontiguous | none | bwd | 517942 | 221010 | 2.343522918 |
| NLLLoss | float32 | [24_21_512_512] | 255 | contiguous | none | fwd | 1664177 | 404422 | 4.114951709 |
| NLLLoss | float32 | [24_21_512_512] | 255 | contiguous | none | bwd | 2476978 | 895935 | 2.764684938 |
| NLLLoss | float32 | [24_21_512_512] | 255 | noncontiguous | none | fwd | 1645458 | 782319 | 2.103308241 |
| NLLLoss | float32 | [24_21_512_512] | 255 | noncontiguous | none | bwd | 2504066 | 1331502 | 1.880632549 |
| NLLLoss | float32 | [16_21_512_512] | 255 | noncontiguous | sum | fwd | 589775 | 604730 | 0.975269955 |
| NLLLoss | float32 | [16_21_512_512] | 255 | noncontiguous | sum | bwd | 1305245 | 809744 | 1.611923028 |
| NLLLoss | float32 | [64_21_254_333] | 255 | noncontiguous | sum | fwd | 849326 | 536917 | 1.581857159 |
| NLLLoss | float32 | [64_21_254_333] | 255 | noncontiguous | sum | bwd | 1178238 | 718266 | 1.640392278 |
| NLLLoss | float32 | [64_21_213_331] | 255 | noncontiguous | sum | fwd | 708335 | 406487 | 1.742577253 |
| NLLLoss | float32 | [64_21_213_331] | 255 | noncontiguous | sum | bwd | 989998 | 542018 | 1.826503917 |
| NLLLoss | float32 | [64_21_240_332] | 255 | noncontiguous | sum | fwd | 710798 | 494120 | 1.438512912 |
| NLLLoss | float32 | [64_21_240_332] | 255 | noncontiguous | sum | bwd | 1047054 | 660462 | 1.58533572 |
| NLLLoss | float32 | [64_21_212_320] | 255 | noncontiguous | sum | fwd | 607807 | 386121 | 1.574136087 |
| NLLLoss | float32 | [64_21_212_320] | 255 | noncontiguous | sum | bwd | 894366 | 517761 | 1.727372282 |
| NLLLoss | float32 | [64_21_218_333] | 255 | noncontiguous | sum | fwd | 729870 | 432544 | 1.687389029 |
| NLLLoss | float32 | [64_21_218_333] | 255 | noncontiguous | sum | bwd | 1007790 | 572489 | 1.760365701 |
| NLLLoss | float32 | [64_21_270_333] | 255 | noncontiguous | sum | fwd | 903406 | 601804 | 1.501163169 |
| NLLLoss | float32 | [64_21_270_333] | 255 | noncontiguous | sum | bwd | 1252109 | 823536 | 1.520405908 |
| NLLLoss | float32 | [64_21_237_329] | 255 | noncontiguous | sum | fwd | 784878 | 488340 | 1.607236761 |
| NLLLoss | float32 | [64_21_237_329] | 255 | noncontiguous | sum | bwd | 1082238 | 652210 | 1.659339783 |
| NLLLoss | float32 | [64_21_225_246] | 255 | noncontiguous | sum | fwd | 538943 | 297166 | 1.813609229 |
| NLLLoss | float32 | [64_21_225_246] | 255 | noncontiguous | sum | bwd | 767679 | 401359 | 1.912699105 |
| NLLLoss | float32 | [64_21_240_292] | 255 | noncontiguous | sum | fwd | 626719 | 405220 | 1.546614185 |
| NLLLoss | float32 | [64_21_240_292] | 255 | noncontiguous | sum | bwd | 924222 | 533964 | 1.730869497 |
| NLLLoss | float32 | [64_21_288_303] | 255 | noncontiguous | sum | fwd | 875534 | 544066 | 1.609242261 |
| NLLLoss | float32 | [64_21_288_303] | 255 | noncontiguous | sum | bwd | 1193854 | 734748 | 1.624848247 |
| NLLLoss | float32 | [64_21_274_275] | 255 | noncontiguous | sum | fwd | 757726 | 427785 | 1.771277628 |
| NLLLoss | float32 | [64_21_274_275] | 255 | noncontiguous | sum | bwd | 1047678 | 565900 | 1.851348295 |
| NLLLoss | float32 | [64_21_273_322] | 255 | noncontiguous | sum | fwd | 895774 | 580321 | 1.543583637 |
| NLLLoss | float32 | [64_21_273_322] | 255 | noncontiguous | sum | bwd | 1236254 | 773494 | 1.598272256 |
| NLLLoss | float32 | [64_21_240_320] | 255 | noncontiguous | sum | fwd | 687615 | 478687 | 1.436460568 |
| NLLLoss | float32 | [64_21_240_320] | 255 | noncontiguous | sum | bwd | 1011038 | 623754 | 1.620892211 |
| NLLLoss | float32 | [64_21_238_269] | 255 | noncontiguous | sum | fwd | 622574 | 369372 | 1.685493216 |
| NLLLoss | float32 | [64_21_238_269] | 255 | noncontiguous | sum | bwd | 880159 | 503773 | 1.747134126 |
| NLLLoss | float32 | [64_21_213_326] | 255 | noncontiguous | sum | fwd | 698159 | 394458 | 1.769919738 |
| NLLLoss | float32 | [64_21_213_326] | 255 | noncontiguous | sum | bwd | 966814 | 527881 | 1.831499902 |
| NLLLoss | float32 | [64_21_297_333] | 255 | noncontiguous | sum | fwd | 991582 | 680151 | 1.457885087 |
| NLLLoss | float32 | [64_21_297_333] | 255 | noncontiguous | sum | bwd | 1355726 | 916403 | 1.479399347 |
| NLLLoss | float32 | [64_21_212_303] | 255 | noncontiguous | sum | fwd | 624126 | 365250 | 1.70876386 |
| NLLLoss | float32 | [64_21_212_303] | 255 | noncontiguous | sum | bwd | 886175 | 503066 | 1.761548187 |
| NLLLoss | float32 | [64_21_230_335] | 255 | noncontiguous | sum | fwd | 775567 | 462230 | 1.677881141 |
| NLLLoss | float32 | [64_21_230_335] | 255 | noncontiguous | sum | bwd | 1069278 | 636295 | 1.680475251 |
| NLLLoss | float32 | [64_21_198_257] | 255 | noncontiguous | sum | fwd | 497743 | 265818 | 1.872495467 |
| NLLLoss | float32 | [64_21_198_257] | 255 | noncontiguous | sum | bwd | 707855 | 356131 | 1.98762534 |
| NLLLoss | float32 | [64_21_283_320] | 255 | noncontiguous | sum | fwd | 799311 | 593199 | 1.347458441 |
| NLLLoss | float32 | [64_21_283_320] | 255 | noncontiguous | sum | bwd | 1194974 | 804667 | 1.485054066 |
| NLLLoss | float32 | [64_21_175_333] | 255 | noncontiguous | sum | fwd | 570831 | 317880 | 1.795743677 |
| NLLLoss | float32 | [64_21_175_333] | 255 | noncontiguous | sum | bwd | 821439 | 431668 | 1.902941613 |
| NLLLoss | float32 | [64_21_267_326] | 255 | noncontiguous | sum | fwd | 874366 | 566738 | 1.542804612 |
| NLLLoss | float32 | [64_21_267_326] | 255 | noncontiguous | sum | bwd | 1239982 | 758602 | 1.634561997 |
| NLLLoss | float32 | [32_21_256_256] | 255 | noncontiguous | sum | fwd | 288960 | 187118 | 1.544266185 |
| NLLLoss | float32 | [32_21_256_256] | 255 | noncontiguous | sum | bwd | 1106558 | 246528 | 4.48856925 |
| NLLLoss | float32 | [55_21_112_257] | 255 | noncontiguous | sum | fwd | 244351 | 165591 | 1.475629714 |
| NLLLoss | float32 | [55_21_112_257] | 255 | noncontiguous | sum | bwd | 340575 | 222656 | 1.529601717 |
| NLLLoss | float32 | [24_21_512_512] | 255 | noncontiguous | sum | fwd | 881167 | 846644 | 1.040776288 |
| NLLLoss | float32 | [24_21_512_512] | 255 | noncontiguous | sum | bwd | 1873869 | 1302142 | 1.439066553 |
| NLLLoss | float32 | [16_21_512_512] | 255 | noncontiguous | mean | fwd | 588701 | 605975 | 0.971493874 |
| NLLLoss | float32 | [16_21_512_512] | 255 | noncontiguous | mean | bwd | 1305942 | 811641 | 1.60901433 |
| NLLLoss | float32 | [64_21_254_333] | 255 | noncontiguous | mean | fwd | 850405 | 536788 | 1.584247412 |
| NLLLoss | float32 | [64_21_254_333] | 255 | noncontiguous | mean | bwd | 1175563 | 720446 | 1.631715632 |
| NLLLoss | float32 | [64_21_213_331] | 255 | noncontiguous | mean | fwd | 709146 | 405147 | 1.750342468 |
| NLLLoss | float32 | [64_21_213_331] | 255 | noncontiguous | mean | bwd | 975121 | 542247 | 1.798296717 |
| NLLLoss | float32 | [64_21_240_332] | 255 | noncontiguous | mean | fwd | 711002 | 495156 | 1.435915146 |
| NLLLoss | float32 | [64_21_240_332] | 255 | noncontiguous | mean | bwd | 1049359 | 661162 | 1.587143544 |
| NLLLoss | float32 | [64_21_212_320] | 255 | noncontiguous | mean | fwd | 607725 | 389150 | 1.561672877 |
| NLLLoss | float32 | [64_21_212_320] | 255 | noncontiguous | mean | bwd | 894292 | 516135 | 1.732670716 |
| NLLLoss | float32 | [64_21_218_333] | 255 | noncontiguous | mean | fwd | 730650 | 432973 | 1.687518621 |
| NLLLoss | float32 | [64_21_218_333] | 255 | noncontiguous | mean | bwd | 1007218 | 572100 | 1.760562839 |
| NLLLoss | float32 | [64_21_270_333] | 255 | noncontiguous | mean | fwd | 902793 | 602839 | 1.497569003 |
| NLLLoss | float32 | [64_21_270_333] | 255 | noncontiguous | mean | bwd | 1250752 | 823103 | 1.519557091 |
| NLLLoss | float32 | [64_21_237_329] | 255 | noncontiguous | mean | fwd | 784846 | 487125 | 1.611179882 |
| NLLLoss | float32 | [64_21_237_329] | 255 | noncontiguous | mean | bwd | 1081943 | 654164 | 1.653932347 |
| NLLLoss | float32 | [64_21_225_246] | 255 | noncontiguous | mean | fwd | 544181 | 296763 | 1.833722533 |
| NLLLoss | float32 | [64_21_225_246] | 255 | noncontiguous | mean | bwd | 769312 | 398984 | 1.928177571 |
| NLLLoss | float32 | [64_21_240_292] | 255 | noncontiguous | mean | fwd | 627331 | 405190 | 1.548239098 |
| NLLLoss | float32 | [64_21_240_292] | 255 | noncontiguous | mean | bwd | 922413 | 535002 | 1.724130003 |
| NLLLoss | float32 | [64_21_288_303] | 255 | noncontiguous | mean | fwd | 875695 | 547589 | 1.599182964 |
| NLLLoss | float32 | [64_21_288_303] | 255 | noncontiguous | mean | bwd | 1195737 | 734362 | 1.628266441 |
| NLLLoss | float32 | [64_21_274_275] | 255 | noncontiguous | mean | fwd | 758402 | 429724 | 1.764858374 |
| NLLLoss | float32 | [64_21_274_275] | 255 | noncontiguous | mean | bwd | 1046317 | 567502 | 1.843723899 |
| NLLLoss | float32 | [64_21_273_322] | 255 | noncontiguous | mean | fwd | 881969 | 580196 | 1.52012251 |
| NLLLoss | float32 | [64_21_273_322] | 255 | noncontiguous | mean | bwd | 1230730 | 774436 | 1.589195234 |
| NLLLoss | float32 | [64_21_240_320] | 255 | noncontiguous | mean | fwd | 690420 | 477156 | 1.446948168 |
| NLLLoss | float32 | [64_21_240_320] | 255 | noncontiguous | mean | bwd | 1015743 | 619841 | 1.638715413 |
| NLLLoss | float32 | [64_21_238_269] | 255 | noncontiguous | mean | fwd | 624822 | 369156 | 1.692568995 |
| NLLLoss | float32 | [64_21_238_269] | 255 | noncontiguous | mean | bwd | 892113 | 502703 | 1.774632338 |
| NLLLoss | float32 | [64_21_213_326] | 255 | noncontiguous | mean | fwd | 699269 | 393903 | 1.775231466 |
| NLLLoss | float32 | [64_21_213_326] | 255 | noncontiguous | mean | bwd | 966512 | 528446 | 1.828970226 |
| NLLLoss | float32 | [64_21_297_333] | 255 | noncontiguous | mean | fwd | 991504 | 680216 | 1.457631105 |
| NLLLoss | float32 | [64_21_297_333] | 255 | noncontiguous | mean | bwd | 1355450 | 916377 | 1.479140136 |
| NLLLoss | float32 | [64_21_212_303] | 255 | noncontiguous | mean | fwd | 623894 | 365744 | 1.705821558 |
| NLLLoss | float32 | [64_21_212_303] | 255 | noncontiguous | mean | bwd | 885426 | 501121 | 1.766890631 |
| NLLLoss | float32 | [64_21_230_335] | 255 | noncontiguous | mean | fwd | 777700 | 462740 | 1.680641397 |
| NLLLoss | float32 | [64_21_230_335] | 255 | noncontiguous | mean | bwd | 1069583 | 636110 | 1.681443461 |
| NLLLoss | float32 | [64_21_198_257] | 255 | noncontiguous | mean | fwd | 499368 | 265033 | 1.884172914 |
| NLLLoss | float32 | [64_21_198_257] | 255 | noncontiguous | mean | bwd | 705301 | 354349 | 1.990413406 |
| NLLLoss | float32 | [64_21_283_320] | 255 | noncontiguous | mean | fwd | 807491 | 593301 | 1.361014055 |
| NLLLoss | float32 | [64_21_283_320] | 255 | noncontiguous | mean | bwd | 1189486 | 804148 | 1.479187911 |
| NLLLoss | float32 | [64_21_175_333] | 255 | noncontiguous | mean | fwd | 570247 | 319167 | 1.786672808 |
| NLLLoss | float32 | [64_21_175_333] | 255 | noncontiguous | mean | bwd | 819475 | 433176 | 1.891783017 |
| NLLLoss | float32 | [64_21_267_326] | 255 | noncontiguous | mean | fwd | 893666 | 568289 | 1.572555513 |
| NLLLoss | float32 | [64_21_267_326] | 255 | noncontiguous | mean | bwd | 1215629 | 754441 | 1.611297636 |
| NLLLoss | float32 | [32_21_256_256] | 255 | noncontiguous | mean | fwd | 290299 | 187948 | 1.544570839 |
| NLLLoss | float32 | [32_21_256_256] | 255 | noncontiguous | mean | bwd | 1102975 | 246331 | 4.477613455 |
| NLLLoss | float32 | [55_21_112_257] | 255 | noncontiguous | mean | fwd | 243868 | 165192 | 1.476270037 |
| NLLLoss | float32 | [55_21_112_257] | 255 | noncontiguous | mean | bwd | 341755 | 222952 | 1.532863576 |
| NLLLoss | float32 | [24_21_512_512] | 255 | noncontiguous | mean | fwd | 880434 | 847616 | 1.038718004 |
| NLLLoss | float32 | [24_21_512_512] | 255 | noncontiguous | mean | bwd | 1867123 | 1305291 | 1.430426625 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | contiguous | none | fwd | 108820 | 95323 | 1.14159227 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | contiguous | none | bwd | 217112 | 111465 | 1.947804243 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | noncontiguous | none | fwd | 198436 | 263248 | 0.753798699 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | noncontiguous | none | bwd | 217434 | 247871 | 0.877206289 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | contiguous | mean | fwd | 167046 | 140602 | 1.188076983 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | contiguous | mean | bwd | 214152 | 97456 | 2.197422427 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | noncontiguous | mean | fwd | 262017 | 324367 | 0.80777946 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | noncontiguous | mean | bwd | 214088 | 234929 | 0.911288091 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | contiguous | sum | fwd | 164294 | 140371 | 1.17042694 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | contiguous | sum | bwd | 214184 | 97349 | 2.200166412 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | noncontiguous | sum | fwd | 258788 | 312154 | 0.829039513 |
| NLLLoss | float32 | [2_3_128_128_128] | -100 | noncontiguous | sum | bwd | 214030 | 238591 | 0.897058146 |
| NLLLoss | float32 | [256_81_8732] | -100 | contiguous | none | fwd | 436751 | 199890 | 2.184956726 |
| NLLLoss | float32 | [256_81_8732] | -100 | contiguous | none | bwd | 1093797 | 441664 | 2.476536462 |
| NLLLoss | float32 | [256_81_8732] | -100 | noncontiguous | none | fwd | 1776074 | 107589 | 16.50795156 |
| NLLLoss | float32 | [256_81_8732] | -100 | noncontiguous | none | bwd | 2382248 | 180815 | 13.17505738 |
| NLLLoss | float32 | [256_81_8732] | -100 | contiguous | mean | fwd | 212087 | 233525 | 0.908198266 |
| NLLLoss | float32 | [256_81_8732] | -100 | contiguous | mean | bwd | 981265 | 433024 | 2.266075321 |
| NLLLoss | float32 | [256_81_8732] | -100 | noncontiguous | mean | fwd | 1544778 | 142771 | 10.81997044 |
| NLLLoss | float32 | [256_81_8732] | -100 | noncontiguous | mean | bwd | 978418 | 151731 | 6.448372449 |
| NLLLoss | float32 | [256_81_8732] | -100 | contiguous | sum | fwd | 207207 | 233507 | 0.887369544 |
| NLLLoss | float32 | [256_81_8732] | -100 | contiguous | sum | bwd | 981088 | 434980 | 2.255478413 |
| NLLLoss | float32 | [256_81_8732] | -100 | noncontiguous | sum | fwd | 1537518 | 142486 | 10.79066014 |
| NLLLoss | float32 | [256_81_8732] | -100 | noncontiguous | sum | bwd | 979988 | 151571 | 6.465537603 |
| NLLLoss | float32 | [256_100] | -100 | contiguous | none | fwd | 9936 | 9244 | 1.074859368 |
| NLLLoss | float32 | [256_100] | -100 | contiguous | none | bwd | 14960 | 7680 | 1.947916667 |
| NLLLoss | float32 | [256_100] | -100 | noncontiguous | none | fwd | 16888 | 9191 | 1.837449679 |
| NLLLoss | float32 | [256_100] | -100 | noncontiguous | none | bwd | 20612 | 8409 | 2.451183256 |
| NLLLoss | float32 | [256_100] | -100 | contiguous | mean | fwd | 15936 | 15433 | 1.032592497 |
| NLLLoss | float32 | [256_100] | -100 | contiguous | mean | bwd | 13505 | 6774 | 1.9936522 |
| NLLLoss | float32 | [256_100] | -100 | noncontiguous | mean | fwd | 19041 | 15718 | 1.211413666 |
| NLLLoss | float32 | [256_100] | -100 | noncontiguous | mean | bwd | 20961 | 6632 | 3.160585042 |
| NLLLoss | float32 | [256_100] | -100 | contiguous | sum | fwd | 15553 | 15734 | 0.98849625 |
| NLLLoss | float32 | [256_100] | -100 | contiguous | sum | bwd | 13073 | 6952 | 1.880466053 |
| NLLLoss | float32 | [256_100] | -100 | noncontiguous | sum | fwd | 18459 | 15806 | 1.167847653 |
| NLLLoss | float32 | [256_100] | -100 | noncontiguous | sum | bwd | 20160 | 6739 | 2.991541772 |
| NLLLoss | float32 | [40_2] | -100 | contiguous | none | fwd | 8416 | 7449 | 1.129816083 |
| NLLLoss | float32 | [40_2] | -100 | contiguous | none | bwd | 14400 | 7502 | 1.919488136 |
| NLLLoss | float32 | [40_2] | -100 | noncontiguous | none | fwd | 11622 | 7413 | 1.567786321 |
| NLLLoss | float32 | [40_2] | -100 | noncontiguous | none | bwd | 16945 | 7467 | 2.269318334 |
| NLLLoss | float32 | [40_2] | -100 | contiguous | mean | fwd | 8464 | 13920 | 0.608045977 |
| NLLLoss | float32 | [40_2] | -100 | contiguous | mean | bwd | 9168 | 6204 | 1.477756286 |
| NLLLoss | float32 | [40_2] | -100 | noncontiguous | mean | fwd | 12480 | 14044 | 0.888635716 |
| NLLLoss | float32 | [40_2] | -100 | noncontiguous | mean | bwd | 14603 | 6293 | 2.320514858 |
| NLLLoss | float32 | [40_2] | -100 | contiguous | sum | fwd | 7216 | 14293 | 0.50486252 |
| NLLLoss | float32 | [40_2] | -100 | contiguous | sum | bwd | 8128 | 6044 | 1.344804765 |
| NLLLoss | float32 | [40_2] | -100 | noncontiguous | sum | fwd | 11738 | 14044 | 0.835801766 |
| NLLLoss | float32 | [40_2] | -100 | noncontiguous | sum | bwd | 13440 | 6133 | 2.191423447 |
| NLLLoss | float32 | [8192_52100] | -100 | contiguous | none | fwd | 33745 | 15574 | 2.166752279 |
| NLLLoss | float32 | [8192_52100] | -100 | contiguous | none | bwd | 1184589 | 25957 | 45.63659129 |
| NLLLoss | float32 | [8192_52100] | -100 | noncontiguous | none | fwd | 3299934 | 15396 | 214.3371005 |
| NLLLoss | float32 | [8192_52100] | -100 | noncontiguous | none | bwd | 4476791 | 24890 | 179.8630374 |
| NLLLoss | float32 | [8192_52100] | -100 | contiguous | mean | fwd | 1735257 | 33797 | 51.34352161 |
| NLLLoss | float32 | [8192_52100] | -100 | contiguous | mean | bwd | 1459906 | 24659 | 59.20377955 |
| NLLLoss | float32 | [8192_52100] | -100 | noncontiguous | mean | fwd | 3642497 | 34135 | 106.7085689 |
| NLLLoss | float32 | [8192_52100] | -100 | noncontiguous | mean | bwd | 4738202 | 14152 | 334.8079423 |
| NLLLoss | float32 | [8192_52100] | -100 | contiguous | sum | fwd | 458394 | 35771 | 12.81468228 |
| NLLLoss | float32 | [8192_52100] | -100 | contiguous | sum | bwd | 1457040 | 14170 | 102.8256881 |
| NLLLoss | float32 | [8192_52100] | -100 | noncontiguous | sum | fwd | 3649286 | 25565 | 142.7453941 |
| NLLLoss | float32 | [8192_52100] | -100 | noncontiguous | sum | bwd | 4741964 | 24428 | 194.1200262 |
| NLLLoss | float32 | [20480_50000] | -100 | contiguous | none | fwd | 42704 | 23501 | 1.817114165 |
| NLLLoss | float32 | [20480_50000] | -100 | contiguous | none | bwd | 2804726 | 29812 | 94.08043741 |
| NLLLoss | float32 | [20480_50000] | -100 | noncontiguous | none | fwd | 7850643 | 23217 | 338.142008 |
| NLLLoss | float32 | [20480_50000] | -100 | noncontiguous | none | bwd | 10698810 | 23591 | 453.5123564 |
| NLLLoss | float32 | [20480_50000] | -100 | contiguous | mean | fwd | 1125945 | 31590 | 35.64245014 |
| NLLLoss | float32 | [20480_50000] | -100 | contiguous | mean | bwd | 3437690 | 27963 | 122.9370954 |
| NLLLoss | float32 | [20480_50000] | -100 | noncontiguous | mean | fwd | 8752699 | 37706 | 232.1301384 |
| NLLLoss | float32 | [20480_50000] | -100 | noncontiguous | mean | bwd | 11695705 | 28764 | 406.6091295 |
| NLLLoss | float32 | [20480_50000] | -100 | contiguous | sum | fwd | 1121600 | 35430 | 31.65678803 |
| NLLLoss | float32 | [20480_50000] | -100 | contiguous | sum | bwd | 3445887 | 26844 | 128.3671211 |
| NLLLoss | float32 | [20480_50000] | -100 | noncontiguous | sum | fwd | 8711305 | 32585 | 267.3409544 |
| NLLLoss | float32 | [20480_50000] | -100 | noncontiguous | sum | bwd | 11648068 | 26008 | 447.8648108 |
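
The last column of the table above appears to be the ratio `rocm_kernel_avg / miopen_kernel_time`, so values below 1 mean the MIOpen kernel was slower for that case. A minimal Python sketch, not part of this PR, that recomputes the ratio for two rows copied from the float32 table. The function name `improvement_over_rocm` is only illustrative, and the derivation of the column is an assumption that the two sample rows happen to agree with.

```python
def improvement_over_rocm(rocm_kernel_avg: int, miopen_kernel_time: int) -> float:
    """Ratio of the ROCm baseline kernel time to the MIOpen kernel time.

    Assumption: this is how the last table column is derived. Both
    timings must be in the same unit. Values > 1 mean MIOpen is faster.
    """
    if miopen_kernel_time <= 0:
        raise ValueError("miopen_kernel_time must be positive")
    return rocm_kernel_avg / miopen_kernel_time


# Rows copied from the float32 table:
# (rocm_kernel_avg, miopen_kernel_time, reported ratio)
rows = [
    (589775, 604730, 0.975269955),   # noncontiguous, sum, fwd: MIOpen slightly slower
    (1184589, 25957, 45.63659129),   # contiguous, none, bwd: MIOpen much faster
]

for rocm, miopen, reported in rows:
    ratio = improvement_over_rocm(rocm, miopen)
    print(f"{ratio:.9f} (reported {reported})")
```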

Nllloss bfloat16
| op_name | dtype | size | ignore_index | contiguous | reduction | direction | rocm_kernel_avg | miopen_kernel_time | improvement over rocm |
|---|---|---|---|---|---|---|---|---|---|
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | contiguous | none | fwd | 891301 | 194631 | 4.579440069 |
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | contiguous | none | bwd | 1436993 | 371946 | 3.863445231 |
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | noncontiguous | none | fwd | 889589 | 500353 | 1.777922787 |
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | noncontiguous | none | bwd | 1440529 | 684550 | 2.10434446 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | contiguous | none | fwd | 1311214 | 223626 | 5.863423752 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | contiguous | none | bwd | 1826042 | 426097 | 4.28550776 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | noncontiguous | none | fwd | 1310094 | 379199 | 3.454898352 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | noncontiguous | none | bwd | 1824665 | 582256 | 3.133784796 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | contiguous | none | fwd | 1101801 | 184889 | 5.959256635 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | contiguous | none | bwd | 1516514 | 365332 | 4.151057121 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | noncontiguous | none | fwd | 1111721 | 277475 | 4.006562753 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | noncontiguous | none | bwd | 1510322 | 420106 | 3.595097428 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | contiguous | none | fwd | 1374686 | 209773 | 6.553207515 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | contiguous | none | bwd | 1792920 | 398338 | 4.501001662 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | noncontiguous | none | fwd | 1361070 | 348680 | 3.903493174 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | noncontiguous | none | bwd | 1785368 | 518870 | 3.440877291 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | contiguous | none | fwd | 1132329 | 193425 | 5.854098488 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | contiguous | none | bwd | 1499585 | 336702 | 4.453745448 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | noncontiguous | none | fwd | 1134489 | 266202 | 4.261759867 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | noncontiguous | none | bwd | 1501793 | 389286 | 3.857814049 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | contiguous | none | fwd | 1140008 | 190440 | 5.986179374 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | contiguous | none | bwd | 1582738 | 363549 | 4.353575447 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | noncontiguous | none | fwd | 1137961 | 297829 | 3.820853577 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | noncontiguous | none | bwd | 1592499 | 449552 | 3.542413336 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | contiguous | none | fwd | 1483838 | 234955 | 6.31541359 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | contiguous | none | bwd | 2009448 | 449663 | 4.468786625 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | noncontiguous | none | fwd | 1477033 | 428669 | 3.445625879 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | noncontiguous | none | bwd | 2016082 | 659430 | 3.057310101 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | contiguous | none | fwd | 1249832 | 203848 | 6.131195793 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | contiguous | none | bwd | 1722699 | 392354 | 4.390675258 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | noncontiguous | none | fwd | 1248198 | 340837 | 3.662155224 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | noncontiguous | none | bwd | 1725448 | 517646 | 3.333258636 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | contiguous | none | fwd | 832399 | 145984 | 5.701987889 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | contiguous | none | bwd | 1182239 | 270210 | 4.375259983 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | noncontiguous | none | fwd | 822718 | 191832 | 4.288742233 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | noncontiguous | none | bwd | 1183438 | 282868 | 4.18371113 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | contiguous | none | fwd | 1125128 | 182891 | 6.151904686 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | contiguous | none | bwd | 1548005 | 353248 | 4.38220457 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | noncontiguous | none | fwd | 1145191 | 281056 | 4.074600791 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | noncontiguous | none | bwd | 1553396 | 413478 | 3.756901214 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | contiguous | none | fwd | 1579565 | 230695 | 6.846984113 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | contiguous | none | bwd | 2040840 | 442083 | 4.616418184 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | noncontiguous | none | fwd | 1578764 | 382974 | 4.122379065 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | noncontiguous | none | bwd | 2041446 | 575732 | 3.545826878 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | contiguous | none | fwd | 1293404 | 197185 | 6.559342749 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | contiguous | none | bwd | 1722390 | 378869 | 4.546136 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | noncontiguous | none | fwd | 1283692 | 295174 | 4.348933172 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | noncontiguous | none | bwd | 1713030 | 434120 | 3.945982678 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | contiguous | none | fwd | 1484342 | 230892 | 6.428728583 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | contiguous | none | bwd | 1978638 | 446335 | 4.433078293 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | noncontiguous | none | fwd | 1463958 | 413482 | 3.540560411 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | noncontiguous | none | bwd | 1977101 | 628268 | 3.146907052 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | contiguous | none | fwd | 1308136 | 207959 | 6.29035531 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | contiguous | none | bwd | 1717039 | 390124 | 4.401264726 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | noncontiguous | none | fwd | 1305879 | 334214 | 3.907313877 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | noncontiguous | none | bwd | 1715776 | 490460 | 3.498299556 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | contiguous | none | fwd | 1017900 | 165419 | 6.153464838 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | contiguous | none | bwd | 1391045 | 314447 | 4.423782068 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | noncontiguous | none | fwd | 1007692 | 247212 | 4.076226073 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | noncontiguous | none | bwd | 1389701 | 370481 | 3.751072255 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | contiguous | none | fwd | 1070602 | 181579 | 5.896067277 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | contiguous | none | bwd | 1499410 | 360259 | 4.162033426 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | noncontiguous | none | fwd | 1065402 | 265790 | 4.008435231 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | noncontiguous | none | bwd | 1495538 | 391014 | 3.824768423 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | contiguous | none | fwd | 1779036 | 263497 | 6.751636641 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | contiguous | none | bwd | 2316529 | 521146 | 4.445067217 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | noncontiguous | none | fwd | 1799691 | 499760 | 3.601110533 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | noncontiguous | none | bwd | 2315601 | 763470 | 3.032995403 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | contiguous | none | fwd | 1012636 | 168939 | 5.994092542 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | contiguous | none | bwd | 1381220 | 312475 | 4.420257621 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | noncontiguous | none | fwd | 1006540 | 243765 | 4.129140771 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | noncontiguous | none | bwd | 1380276 | 366127 | 3.769937754 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | contiguous | none | fwd | 1218807 | 201366 | 6.052695093 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | contiguous | none | bwd | 1690414 | 387460 | 4.362809064 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | noncontiguous | none | fwd | 1237591 | 314555 | 3.934418464 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | noncontiguous | none | bwd | 1689230 | 483370 | 3.494693506 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | contiguous | none | fwd | 704946 | 136691 | 5.157223226 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | contiguous | none | bwd | 1075050 | 255072 | 4.214692322 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | noncontiguous | none | fwd | 699554 | 164869 | 4.243089968 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | noncontiguous | none | bwd | 1077147 | 242965 | 4.433342251 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | contiguous | none | fwd | 1656751 | 239730 | 6.910903934 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | contiguous | none | bwd | 2113478 | 462073 | 4.573904989 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | noncontiguous | none | fwd | 1633936 | 434820 | 3.757729635 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | noncontiguous | none | bwd | 2116694 | 645466 | 3.279326874 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | contiguous | none | fwd | 809440 | 151767 | 5.333438758 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | contiguous | none | bwd | 1175465 | 297293 | 3.95389397 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | noncontiguous | none | fwd | 782289 | 213739 | 3.660019931 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | noncontiguous | none | bwd | 1177001 | 309667 | 3.800860279 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | contiguous | none | fwd | 1476068 | 232654 | 6.344477206 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | contiguous | none | bwd | 1982890 | 448865 | 4.417564301 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | noncontiguous | none | fwd | 1467348 | 402039 | 3.649765321 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | noncontiguous | none | bwd | 1980362 | 604881 | 3.273969591 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | contiguous | none | fwd | 379865 | 90239 | 4.209543545 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | contiguous | none | bwd | 714098 | 159073 | 4.489121347 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | noncontiguous | none | fwd | 374649 | 130700 | 2.86648049 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | noncontiguous | none | bwd | 719282 | 179340 | 4.010717074 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | contiguous | none | fwd | 243068 | 71146 | 3.416467546 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | contiguous | none | bwd | 455880 | 123963 | 3.677548946 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | noncontiguous | none | fwd | 247819 | 94488 | 2.622756329 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | noncontiguous | none | bwd | 458424 | 157758 | 2.905868482 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | contiguous | none | fwd | 1549300 | 293063 | 5.286576606 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | contiguous | none | bwd | 2463139 | 553522 | 4.449938756 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | noncontiguous | none | fwd | 1547188 | 690107 | 2.241953784 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | noncontiguous | none | bwd | 2457636 | 1165105 | 2.109368684 |
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | noncontiguous | sum | fwd | 513999 | 543995 | 0.944859787 |
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | noncontiguous | sum | bwd | 1254477 | 666530 | 1.882101331 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | noncontiguous | sum | fwd | 651071 | 441317 | 1.475291004 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | noncontiguous | sum | bwd | 783310 | 542233 | 1.444600384 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | noncontiguous | sum | fwd | 544943 | 334000 | 1.631565868 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | noncontiguous | sum | bwd | 647871 | 383462 | 1.689531166 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | noncontiguous | sum | fwd | 655310 | 404352 | 1.620642411 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | noncontiguous | sum | bwd | 711870 | 477335 | 1.491342558 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | noncontiguous | sum | fwd | 535535 | 318413 | 1.681887988 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | noncontiguous | sum | bwd | 592207 | 357940 | 1.654486785 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | noncontiguous | sum | fwd | 560239 | 353171 | 1.586310881 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | noncontiguous | sum | bwd | 666815 | 408652 | 1.631742901 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | noncontiguous | sum | fwd | 691167 | 490557 | 1.408943303 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | noncontiguous | sum | bwd | 888078 | 607139 | 1.462725998 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | noncontiguous | sum | fwd | 600974 | 399881 | 1.502882107 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | noncontiguous | sum | bwd | 712767 | 472644 | 1.508041994 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | noncontiguous | sum | fwd | 422815 | 236866 | 1.785038798 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | noncontiguous | sum | bwd | 536767 | 259870 | 2.065521222 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | noncontiguous | sum | fwd | 578527 | 334288 | 1.730624491 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | noncontiguous | sum | bwd | 624943 | 377985 | 1.653353969 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | noncontiguous | sum | fwd | 685151 | 441098 | 1.553285211 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | noncontiguous | sum | bwd | 786158 | 538147 | 1.460861066 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | noncontiguous | sum | fwd | 580191 | 346061 | 1.676557023 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | noncontiguous | sum | bwd | 693199 | 401119 | 1.728162964 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | noncontiguous | sum | fwd | 675534 | 475574 | 1.420460328 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | noncontiguous | sum | bwd | 868878 | 573353 | 1.515432901 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | noncontiguous | sum | fwd | 604623 | 388731 | 1.55537634 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | noncontiguous | sum | bwd | 675471 | 450012 | 1.50100664 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | noncontiguous | sum | fwd | 487359 | 297354 | 1.638985855 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | noncontiguous | sum | bwd | 605679 | 342670 | 1.767528526 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | noncontiguous | sum | fwd | 536895 | 318368 | 1.686397502 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | noncontiguous | sum | bwd | 643471 | 362244 | 1.776346882 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | noncontiguous | sum | fwd | 757454 | 564648 | 1.341462292 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | noncontiguous | sum | bwd | 975694 | 716739 | 1.361296092 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | noncontiguous | sum | fwd | 488831 | 295827 | 1.652421855 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | noncontiguous | sum | bwd | 616191 | 345161 | 1.785227763 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | noncontiguous | sum | fwd | 593599 | 369730 | 1.605493198 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | noncontiguous | sum | bwd | 705583 | 450159 | 1.567408405 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | noncontiguous | sum | fwd | 396640 | 209871 | 1.889922857 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | noncontiguous | sum | bwd | 496303 | 228289 | 2.174011888 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | noncontiguous | sum | fwd | 739887 | 496781 | 1.489362516 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | noncontiguous | sum | bwd | 871711 | 613217 | 1.421537563 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | noncontiguous | sum | fwd | 445343 | 260303 | 1.710863878 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | noncontiguous | sum | bwd | 567695 | 287180 | 1.976791559 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | noncontiguous | sum | fwd | 670335 | 463066 | 1.447601422 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | noncontiguous | sum | bwd | 797710 | 566028 | 1.409311907 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | noncontiguous | sum | fwd | 278608 | 161520 | 1.724913323 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | noncontiguous | sum | bwd | 1012126 | 161484 | 6.267655 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | noncontiguous | sum | fwd | 197520 | 123051 | 1.605188093 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | noncontiguous | sum | bwd | 245552 | 144330 | 1.701323356 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | noncontiguous | sum | fwd | 765486 | 756658 | 1.011667094 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | noncontiguous | sum | bwd | 1951034 | 1137849 | 1.714668642 |
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | noncontiguous | mean | fwd | 514528 | 544680 | 0.944642726 |
| NLLLoss | bfloat16 | [16_21_512_512] | 255 | noncontiguous | mean | bwd | 1230089 | 667699 | 1.842280728 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | noncontiguous | mean | fwd | 649979 | 440150 | 1.476721572 |
| NLLLoss | bfloat16 | [64_21_254_333] | 255 | noncontiguous | mean | bwd | 782903 | 541090 | 1.446899776 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | noncontiguous | mean | fwd | 544239 | 332883 | 1.634925785 |
| NLLLoss | bfloat16 | [64_21_213_331] | 255 | noncontiguous | mean | bwd | 649180 | 384455 | 1.688572135 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | noncontiguous | mean | fwd | 655851 | 405789 | 1.616236517 |
| NLLLoss | bfloat16 | [64_21_240_332] | 255 | noncontiguous | mean | bwd | 711498 | 476419 | 1.493429103 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | noncontiguous | mean | fwd | 535999 | 318289 | 1.684001018 |
| NLLLoss | bfloat16 | [64_21_212_320] | 255 | noncontiguous | mean | bwd | 592974 | 359053 | 1.651494348 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | noncontiguous | mean | fwd | 561632 | 353081 | 1.5906605 |
| NLLLoss | bfloat16 | [64_21_218_333] | 255 | noncontiguous | mean | bwd | 666589 | 407889 | 1.634241178 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | noncontiguous | mean | fwd | 692383 | 491729 | 1.408058097 |
| NLLLoss | bfloat16 | [64_21_270_333] | 255 | noncontiguous | mean | bwd | 892250 | 609061 | 1.464959996 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | noncontiguous | mean | fwd | 601331 | 399588 | 1.504877524 |
| NLLLoss | bfloat16 | [64_21_237_329] | 255 | noncontiguous | mean | bwd | 717552 | 472939 | 1.517218923 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | noncontiguous | mean | fwd | 424407 | 236408 | 1.795231126 |
| NLLLoss | bfloat16 | [64_21_225_246] | 255 | noncontiguous | mean | bwd | 537445 | 260994 | 2.059223584 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | noncontiguous | mean | fwd | 577877 | 333972 | 1.730315715 |
| NLLLoss | bfloat16 | [64_21_240_292] | 255 | noncontiguous | mean | bwd | 624356 | 379110 | 1.646899317 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | noncontiguous | mean | fwd | 685283 | 441937 | 1.550635045 |
| NLLLoss | bfloat16 | [64_21_288_303] | 255 | noncontiguous | mean | bwd | 790402 | 537350 | 1.47092584 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | noncontiguous | mean | fwd | 581094 | 349546 | 1.662424974 |
| NLLLoss | bfloat16 | [64_21_274_275] | 255 | noncontiguous | mean | bwd | 693988 | 397582 | 1.745521679 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | noncontiguous | mean | fwd | 676404 | 476125 | 1.420643739 |
| NLLLoss | bfloat16 | [64_21_273_322] | 255 | noncontiguous | mean | bwd | 870545 | 572320 | 1.521080864 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | noncontiguous | mean | fwd | 604534 | 388961 | 1.554227802 |
| NLLLoss | bfloat16 | [64_21_240_320] | 255 | noncontiguous | mean | bwd | 679365 | 452196 | 1.502368442 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | noncontiguous | mean | fwd | 488280 | 298650 | 1.634957308 |
| NLLLoss | bfloat16 | [64_21_238_269] | 255 | noncontiguous | mean | bwd | 613062 | 343663 | 1.783904581 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | noncontiguous | mean | fwd | 535976 | 319450 | 1.677808734 |
| NLLLoss | bfloat16 | [64_21_213_326] | 255 | noncontiguous | mean | bwd | 642166 | 363894 | 1.764706206 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | noncontiguous | mean | fwd | 758116 | 567309 | 1.336336987 |
| NLLLoss | bfloat16 | [64_21_297_333] | 255 | noncontiguous | mean | bwd | 982576 | 718030 | 1.368433074 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | noncontiguous | mean | fwd | 489112 | 297548 | 1.64380873 |
| NLLLoss | bfloat16 | [64_21_212_303] | 255 | noncontiguous | mean | bwd | 620198 | 345228 | 1.796488118 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | noncontiguous | mean | fwd | 594503 | 372784 | 1.594765333 |
| NLLLoss | bfloat16 | [64_21_230_335] | 255 | noncontiguous | mean | bwd | 708709 | 452091 | 1.56762466 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | noncontiguous | mean | fwd | 391514 | 210010 | 1.864263606 |
| NLLLoss | bfloat16 | [64_21_198_257] | 255 | noncontiguous | mean | bwd | 494776 | 228019 | 2.169889351 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | noncontiguous | mean | fwd | 742260 | 497941 | 1.490658532 |
| NLLLoss | bfloat16 | [64_21_283_320] | 255 | noncontiguous | mean | bwd | 871971 | 613657 | 1.420941992 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | noncontiguous | mean | fwd | 444745 | 260268 | 1.708796318 |
| NLLLoss | bfloat16 | [64_21_175_333] | 255 | noncontiguous | mean | bwd | 568759 | 287806 | 1.976188822 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | noncontiguous | mean | fwd | 669862 | 463932 | 1.443879707 |
| NLLLoss | bfloat16 | [64_21_267_326] | 255 | noncontiguous | mean | bwd | 796579 | 568271 | 1.40175902 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | noncontiguous | mean | fwd | 260348 | 160606 | 1.621035329 |
| NLLLoss | bfloat16 | [32_21_256_256] | 255 | noncontiguous | mean | bwd | 1003296 | 161459 | 6.213936665 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | noncontiguous | mean | fwd | 195629 | 123894 | 1.579003019 |
| NLLLoss | bfloat16 | [55_21_112_257] | 255 | noncontiguous | mean | bwd | 240604 | 145779 | 1.650470918 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | noncontiguous | mean | fwd | 767348 | 756664 | 1.014119874 |
| NLLLoss | bfloat16 | [24_21_512_512] | 255 | noncontiguous | mean | bwd | 1953938 | 1142196 | 1.710685381 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | contiguous | none | fwd | 102052 | 93705 | 1.089077424 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | contiguous | none | bwd | 169734 | 98362 | 1.725605417 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | noncontiguous | none | fwd | 177912 | 260066 | 0.684103266 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | noncontiguous | none | bwd | 168646 | 240547 | 0.701093757 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | contiguous | mean | fwd | 169607 | 139197 | 1.218467352 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | contiguous | mean | bwd | 199783 | 88958 | 2.245812631 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | noncontiguous | mean | fwd | 252039 | 311443 | 0.809262048 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | noncontiguous | mean | bwd | 200953 | 213276 | 0.942220409 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | contiguous | sum | fwd | 166566 | 139286 | 1.195856009 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | contiguous | sum | bwd | 200568 | 89065 | 2.251928367 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | noncontiguous | sum | fwd | 247093 | 313985 | 0.786957976 |
| NLLLoss | bfloat16 | 2 3 128 128 128 | -100 | noncontiguous | sum | bwd | 200342 | 238840 | 0.838812594 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | contiguous | none | fwd | 394940 | 164851 | 2.395739183 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | contiguous | none | bwd | 895980 | 377203 | 2.375325753 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | noncontiguous | none | fwd | 1411041 | 91287 | 15.45719544 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | noncontiguous | none | bwd | 4962643 | 122825 | 40.40417667 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | contiguous | mean | fwd | 208918 | 202219 | 1.033127451 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | contiguous | mean | bwd | 771240 | 373861 | 2.062905732 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | noncontiguous | mean | fwd | 1238918 | 128317 | 9.655135329 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | noncontiguous | mean | bwd | 769085 | 109474 | 7.025275408 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | contiguous | sum | fwd | 205830 | 202895 | 1.01446561 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | contiguous | sum | bwd | 770375 | 374110 | 2.05922055 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | noncontiguous | sum | fwd | 1233389 | 128122 | 9.626676137 |
| NLLLoss | bfloat16 | 256 81 8732 | -100 | noncontiguous | sum | bwd | 769259 | 109687 | 7.013219433 |
| NLLLoss | bfloat16 | 256 100 | -100 | contiguous | none | fwd | 10224 | 9264 | 1.103626943 |
| NLLLoss | bfloat16 | 256 100 | -100 | contiguous | none | bwd | 15056 | 7948 | 1.894313035 |
| NLLLoss | bfloat16 | 256 100 | -100 | noncontiguous | none | fwd | 16291 | 9210 | 1.768838219 |
| NLLLoss | bfloat16 | 256 100 | -100 | noncontiguous | none | bwd | 21135 | 7841 | 2.695447009 |
| NLLLoss | bfloat16 | 256 100 | -100 | contiguous | mean | fwd | 15760 | 16126 | 0.977303733 |
| NLLLoss | bfloat16 | 256 100 | -100 | contiguous | mean | bwd | 15968 | 6881 | 2.320592937 |
| NLLLoss | bfloat16 | 256 100 | -100 | noncontiguous | mean | fwd | 19171 | 15860 | 1.208764187 |
| NLLLoss | bfloat16 | 256 100 | -100 | noncontiguous | mean | bwd | 21891 | 6970 | 3.140746055 |
| NLLLoss | bfloat16 | 256 100 | -100 | contiguous | sum | fwd | 15488 | 16411 | 0.943757236 |
| NLLLoss | bfloat16 | 256 100 | -100 | contiguous | sum | bwd | 14864 | 6934 | 2.143640035 |
| NLLLoss | bfloat16 | 256 100 | -100 | noncontiguous | sum | fwd | 19070 | 15878 | 1.201032876 |
| NLLLoss | bfloat16 | 256 100 | -100 | noncontiguous | sum | bwd | 20670 | 6827 | 3.027684195 |
| NLLLoss | bfloat16 | 40 2 | -100 | contiguous | none | fwd | 8560 | 7307 | 1.171479403 |
| NLLLoss | bfloat16 | 40 2 | -100 | contiguous | none | bwd | 14528 | 7396 | 1.96430503 |
| NLLLoss | bfloat16 | 40 2 | -100 | noncontiguous | none | fwd | 10996 | 7467 | 1.47261283 |
| NLLLoss | bfloat16 | 40 2 | -100 | noncontiguous | none | bwd | 16000 | 7182 | 2.227791701 |
| NLLLoss | bfloat16 | 40 2 | -100 | contiguous | mean | fwd | 8544 | 13973 | 0.611464968 |
| NLLLoss | bfloat16 | 40 2 | -100 | contiguous | mean | bwd | 10416 | 6382 | 1.632090254 |
| NLLLoss | bfloat16 | 40 2 | -100 | noncontiguous | mean | fwd | 11927 | 13102 | 0.910319035 |
| NLLLoss | bfloat16 | 40 2 | -100 | noncontiguous | mean | bwd | 15098 | 6471 | 2.333178798 |
| NLLLoss | bfloat16 | 40 2 | -100 | contiguous | sum | fwd | 6960 | 13600 | 0.511764706 |
| NLLLoss | bfloat16 | 40 2 | -100 | contiguous | sum | bwd | 8944 | 6524 | 1.370938075 |
| NLLLoss | bfloat16 | 40 2 | -100 | noncontiguous | sum | fwd | 11665 | 13529 | 0.862221894 |
| NLLLoss | bfloat16 | 40 2 | -100 | noncontiguous | sum | bwd | 14516 | 6684 | 2.171753441 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | contiguous | none | fwd | 28177 | 14276 | 1.973732138 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | contiguous | none | bwd | 784960 | 14738 | 53.26095807 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | noncontiguous | none | fwd | 2432283 | 14347 | 169.5325155 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | noncontiguous | none | bwd | 3191033 | 14632 | 218.0859076 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | contiguous | mean | fwd | 409576 | 24605 | 16.64604755 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | contiguous | mean | bwd | 1033941 | 12925 | 79.9954352 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | noncontiguous | mean | fwd | 2758091 | 24783 | 111.289634 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | noncontiguous | mean | bwd | 3430700 | 13085 | 262.1857088 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | contiguous | sum | fwd | 408184 | 24516 | 16.64969816 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | contiguous | sum | bwd | 1037012 | 12960 | 80.01635802 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | noncontiguous | sum | fwd | 2758670 | 24641 | 111.9544661 |
| NLLLoss | bfloat16 | 8192 52100 | -100 | noncontiguous | sum | bwd | 3435032 | 12711 | 270.2408937 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | contiguous | none | fwd | 35152 | 20267 | 1.734445157 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | contiguous | none | bwd | 1854038 | 21280 | 87.12584586 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | noncontiguous | none | fwd | 5798155 | 20355 | 284.8516335 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | noncontiguous | none | bwd | 7598346 | 20728 | 366.5740062 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | contiguous | mean | fwd | 998743 | 28462 | 35.09040124 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | contiguous | mean | bwd | 2439899 | 20017 | 121.8913424 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | noncontiguous | mean | fwd | 6649970 | 27786 | 239.3280789 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | noncontiguous | mean | bwd | 8159293 | 18880 | 432.1659428 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | contiguous | sum | fwd | 992261 | 28800 | 34.45350694 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | contiguous | sum | bwd | 3298188 | 19715 | 167.29333 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | noncontiguous | sum | fwd | 6624722 | 28978 | 228.6121195 |
| NLLLoss | bfloat16 | 20480 50000 | -100 | noncontiguous | sum | bwd | 8167933 | 20533 | 397.7954025 |
- Average over all cases:
Contiguous:
| type | Forward | Backward |
|---|---|---|
| float16 | 5.27 | 4.09 |
| float32 | 4.03 | 2.82 |
| bfloat16 | 5.28 | 4.03 |
Non-contiguous:
| type | Forward | Backward |
|---|---|---|
| float16 | 3.84 | 3.51 |
| float32 | 3.49 | 2.98 |
| bfloat16 | 3.83 | 4.53 |
Reduction:
| type | Forward | Backward |
|---|---|---|
| float16 | 1.66 | 2.00 |
| float32 | 1.71 | 1.97 |
| bfloat16 | 1.66 | 2.03 |
These averages exclude the cases with a large number of classes, where MIOpen significantly outperforms ROCm:
Input size = [8192 52100] (N, C)
| op_name | dtype | contiguous | reduction | ignore index | direction | ROCm pytorch | MIOpen HIP | Improvement |
|---|---|---|---|---|---|---|---|---|
| NLLLoss | float16 | true | none | -100 | fwd | 28769 | 14437 | 1.99 |
| NLLLoss | float16 | true | none | -100 | bwd | 784551 | 14260 | 55.01 |
| NLLLoss | float16 | false | none | -100 | fwd | 2435181 | 14579 | 167.03 |
| NLLLoss | float16 | false | none | -100 | bwd | 3194068 | 14455 | 220.96 |
| NLLLoss | float16 | false | mean | -100 | fwd | 2759594 | 24642 | 111.98 |
| NLLLoss | float16 | false | mean | -100 | bwd | 3429560 | 12783 | 268.29 |
| NLLLoss | float16 | false | sum | -100 | fwd | 2760448 | 25104 | 109.96 |
| NLLLoss | float16 | false | sum | -100 | bwd | 3429032 | 12961 | 264.56 |
| NLLLoss | float32 | true | none | -100 | fwd | 33745 | 15574 | 2.16 |
| NLLLoss | float32 | true | none | -100 | bwd | 1184589 | 25957 | 45.63 |
| NLLLoss | float32 | false | none | -100 | fwd | 3299934 | 15396 | 214.33 |
| NLLLoss | float32 | false | none | -100 | bwd | 4476791 | 24890 | 179.86 |
| NLLLoss | float32 | false | mean | -100 | fwd | 3642497 | 34135 | 106.70 |
| NLLLoss | float32 | false | mean | -100 | bwd | 4738202 | 14152 | 334.80 |
| NLLLoss | float32 | false | sum | -100 | fwd | 3649286 | 25565 | 142.74 |
| NLLLoss | float32 | false | sum | -100 | bwd | 4741964 | 24428 | 194.12 |
| NLLLoss | bfloat16 | true | none | -100 | fwd | 28177 | 14276 | 1.97 |
| NLLLoss | bfloat16 | true | none | -100 | bwd | 784960 | 14738 | 53.26 |
| NLLLoss | bfloat16 | false | none | -100 | fwd | 2432283 | 14347 | 169.53 |
| NLLLoss | bfloat16 | false | none | -100 | bwd | 3191033 | 14632 | 218.08 |
| NLLLoss | bfloat16 | false | mean | -100 | fwd | 2758091 | 24783 | 111.28 |
| NLLLoss | bfloat16 | false | mean | -100 | bwd | 3430700 | 13085 | 262.18 |
| NLLLoss | bfloat16 | false | sum | -100 | fwd | 2758670 | 24641 | 111.95 |
| NLLLoss | bfloat16 | false | sum | -100 | bwd | 3435032 | 12711 | 270.24 |
Input size = [20480 50000] (N, C)
| op_name | dtype | contiguous | reduction | ignore index | direction | ROCm pytorch | MIOpen HIP | Improvement |
|---|---|---|---|---|---|---|---|---|
| NLLLoss | float16 | true | none | -100 | fwd | 35521 | 20285 | 1.75 |
| NLLLoss | float16 | true | none | -100 | bwd | 1852737 | 21547 | 85.98 |
| NLLLoss | float16 | false | none | -100 | fwd | 5804885 | 19752 | 293.88 |
| NLLLoss | float16 | false | none | -100 | bwd | 7600988 | 21049 | 361.10 |
| NLLLoss | float16 | false | mean | -100 | fwd | 6632306 | 28800 | 230.28 |
| NLLLoss | float16 | false | mean | -100 | bwd | 8166788 | 19449 | 419.90 |
| NLLLoss | float16 | false | sum | -100 | fwd | 6631614 | 27697 | 239.43 |
| NLLLoss | float16 | false | sum | -100 | bwd | 8173236 | 19164 | 426.48 |
| NLLLoss | float32 | true | none | -100 | fwd | 42704 | 23501 | 1.81 |
| NLLLoss | float32 | true | none | -100 | bwd | 2804726 | 29812 | 94.08 |
| NLLLoss | float32 | false | none | -100 | fwd | 7850643 | 23217 | 338.14 |
| NLLLoss | float32 | false | none | -100 | bwd | 10698810 | 23591 | 453.51 |
| NLLLoss | float32 | false | mean | -100 | fwd | 8752699 | 37706 | 232.13 |
| NLLLoss | float32 | false | mean | -100 | bwd | 11695705 | 28764 | 406.60 |
| NLLLoss | float32 | false | sum | -100 | fwd | 8711305 | 32585 | 267.34 |
| NLLLoss | float32 | false | sum | -100 | bwd | 11648068 | 26008 | 447.86 |
| NLLLoss | bfloat16 | true | none | -100 | fwd | 35152 | 20267 | 1.73 |
| NLLLoss | bfloat16 | true | none | -100 | bwd | 1854038 | 21280 | 87.12 |
| NLLLoss | bfloat16 | false | none | -100 | fwd | 5798155 | 20355 | 284.85 |
| NLLLoss | bfloat16 | false | none | -100 | bwd | 7598346 | 20728 | 366.57 |
| NLLLoss | bfloat16 | false | mean | -100 | fwd | 6649970 | 27786 | 239.32 |
| NLLLoss | bfloat16 | false | mean | -100 | bwd | 8159293 | 18880 | 432.16 |
| NLLLoss | bfloat16 | false | sum | -100 | fwd | 6624722 | 28978 | 228.61 |
| NLLLoss | bfloat16 | false | sum | -100 | bwd | 8167933 | 20533 | 397.79 |
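
For readers who want to reproduce the improvement figures, the sketch below recomputes them as the ROCm time divided by the MIOpen time, using the four bfloat16 `[32_21_256_256]`, reduction `none` rows from the full table. The helper name `improvement` is hypothetical, and treating the per-group averages as plain arithmetic means of the ratios is an assumption, since the averaging method is not stated here.

```python
from statistics import mean

# (rocm_kernel_avg, miopen_kernel_time) pairs copied from the bfloat16
# [32_21_256_256], reduction "none" rows of the benchmark table.
rows = {
    ("contiguous", "fwd"): (379865, 90239),
    ("contiguous", "bwd"): (714098, 159073),
    ("noncontiguous", "fwd"): (374649, 130700),
    ("noncontiguous", "bwd"): (719282, 179340),
}


def improvement(rocm_time, miopen_time):
    """Speedup of MIOpen over ROCm; values above 1 mean MIOpen is faster."""
    return rocm_time / miopen_time


speedups = {key: improvement(*times) for key, times in rows.items()}

# Assumption: group averages are arithmetic means of the per-case ratios.
fwd_avg = mean(v for (_, direction), v in speedups.items() if direction == "fwd")
bwd_avg = mean(v for (_, direction), v in speedups.items() if direction == "bwd")

for key, value in speedups.items():
    print(key, round(value, 2))
print("fwd average:", round(fwd_avg, 2), "bwd average:", round(bwd_avg, 2))
```

A ratio below 1 (as in some small-shape forward rows) means the MIOpen kernel was slower than the ROCm one for that case.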
Could you check all the comments on https://github.com/ROCm/MIOpen/pull/3143, https://github.com/ROCm/MIOpen/pull/3156 and https://github.com/ROCm/MIOpen/pull/3166 and apply them here too?
Moreover, I suspect that these 4 PRs are very similar and should really be 1 algorithm that takes the loss function as a parameter (perhaps 2 algorithms, but not 4).
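
One possible reading of this suggestion, sketched in plain Python rather than MIOpen code: a single driver owns the `ignore_index` and `reduction` handling, and the per-sample loss is passed in as a function. All names here (`apply_loss`, `nll_element`, `one_minus_prob_element`) are hypothetical, and the second loss is only a placeholder, because the losses in the other PRs are not named in this thread. Dividing by the number of non-ignored samples for `mean` is an assumption modeled on the unweighted PyTorch convention.

```python
import math
from typing import Callable, List, Sequence, Union

ElementLoss = Callable[[Sequence[float], int], float]


def nll_element(log_probs: Sequence[float], target: int) -> float:
    """NLL of one sample: minus the log-probability of its target class."""
    return -log_probs[target]


def one_minus_prob_element(log_probs: Sequence[float], target: int) -> float:
    """Placeholder second loss (illustrative only): 1 - p(target)."""
    return 1.0 - math.exp(log_probs[target])


def apply_loss(
    element_fn: ElementLoss,
    inputs: Sequence[Sequence[float]],
    targets: Sequence[int],
    reduction: str = "none",
    ignore_index: int = -100,
) -> Union[List[float], float]:
    """Shared driver: ignore_index and reduction logic is written once."""
    if reduction not in ("none", "sum", "mean"):
        raise ValueError(f"unknown reduction: {reduction}")
    losses = []
    counted = 0
    for row, target in zip(inputs, targets):
        if target == ignore_index:
            losses.append(0.0)  # ignored samples contribute nothing
        else:
            losses.append(element_fn(row, target))
            counted += 1
    if reduction == "none":
        return losses
    total = sum(losses)
    if reduction == "sum":
        return total
    # Assumption: "mean" averages over non-ignored samples only.
    return total / counted if counted else float("nan")


log_probs = [[-0.1, -2.3], [-1.2, -0.4], [-0.7, -0.7]]
targets = [0, 1, -100]
print(apply_loss(nll_element, log_probs, targets, reduction="mean"))
print(apply_loss(one_minus_prob_element, log_probs, targets, reduction="sum"))
```

In a kernel setting the same idea would more likely be expressed as a template or functor parameter, so that each loss adds only its per-element function.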
@iq136boy Would you send us the build log of this PR?
Greetings @long10024070 & @hieule88!
Looks like a file is missing which is causing the build to fail compilation:
[2024-09-23T13:30:09.658Z] CMake Error at src/CMakeLists.txt:809 (add_library):
[2024-09-23T13:30:09.658Z] Cannot find source file:
[2024-09-23T13:30:09.658Z] solver/nllloss/solver_reduced.cpp
Sorry for this basic error. I have fixed it.
However, this branch is still being fixed, so CI will keep failing at the static check. Please ignore this PR for now.
CI failed log
Fixed it. Could you please provide me with the CI failure log for the recent test?
I found a bug and need to fix it before reopening this PR.
MIOpen is moving to the new monorepo setup and all older unmerged PRs are being closed. Please re-open this as part of the new repo if these changes are still needed.