MIOpen
Implement PadConstant
- Add `PadConstant` operation [ref] with forward and backward kernels.
- Add driver and gtest for kernels.
- Performance condition:
  - MIOpen is faster if inputs and outputs are all non-contiguous (for both fwd and bwd).
  - For `PadConstantFwd`: MIOpen is faster if the `n` dimension isn't padded.
  - For `PadConstantBwd`: MIOpen is faster unless the `n` dimension is the only one padded.
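
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of what constant padding and its gradient compute along a single dimension. It illustrates the math only and is not the MIOpen kernel code. The function names `pad_constant_fwd_1d` and `pad_constant_bwd_1d` are hypothetical, and non-negative padding is assumed.

```python
from typing import List


def pad_constant_fwd_1d(x: List[float], left: int, right: int, value: float) -> List[float]:
    """Forward: copy x into the middle of the output and fill the borders with `value`."""
    out_len = len(x) + left + right
    out = []
    for o in range(out_len):
        i = o - left  # input index that maps to output position o
        out.append(x[i] if 0 <= i < len(x) else value)
    return out


def pad_constant_bwd_1d(grad_out: List[float], left: int, in_len: int) -> List[float]:
    """Backward: padded borders carry no gradient to the input, so the
    input gradient is the interior slice of the output gradient."""
    return [grad_out[i + left] for i in range(in_len)]


y = pad_constant_fwd_1d([1.0, 2.0, 3.0], left=1, right=2, value=42.0)
print(y)  # [42.0, 1.0, 2.0, 3.0, 42.0, 42.0]

dx = pad_constant_bwd_1d([10.0, 11.0, 12.0, 13.0, 14.0, 15.0], left=1, in_len=3)
print(dx)  # [11.0, 12.0, 13.0]
```

The N-dimensional case applies the same index shift independently per dimension.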
Average speedup over ROCm (ROCm kernel time / MIOpen kernel time)
| type | fwd | bwd |
|---|---|---|
| float | 1.21 | 1.58 |
| float16 | 1.16 | 1.53 |
| bfloat16 | 1.15 | 1.54 |
Detailed benchmark
float32
| op_name | dtype | size (pytorch format) | cont | direction | rocm_kernel_avg | mio_kernel_duration | miopen_kern | rocm/mio kern |
|---|---|---|---|---|---|---|---|---|
| PadConstant | float32 | [4 4 4 4 4]+[0 0 0 0 0 0 0 0 2 2]:42.0 | noncontiguous | fwd | 9792 | 7359 | padconstant | 1.33061557276804 |
| PadConstant | float32 | [4 4 4 4 4]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 9664 | 7519 | padconstant | 1.28527729751297 |
| PadConstant | float32 | [4 4 4 4 4]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 9712 | 7661 | padconstant | 1.26771961884871 |
| PadConstant | float32 | [8 8 4 4 4]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 11264 | 7928 | padconstant | 1.42078708375378 |
| PadConstant | float32 | [4 4 4 4 4]+[0 0 0 0 0 0 2 2 2 2]:42.0 | noncontiguous | fwd | 9744 | 6968 | padconstant | 1.398392652124 |
| PadConstant | float32 | [8 8 4 4 4]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 11584 | 7768 | padconstant | 1.49124613800206 |
| PadConstant | float32 | [8 8 4 4 4]+[0 0 0 0 0 0 0 0 2 2]:42.0 | noncontiguous | fwd | 11296 | 7839 | padconstant | 1.44100012756729 |
| PadConstant | float32 | [4 4 4 4 4]+[0 0 0 0 1 1 1 1 1 1]:42.0 | noncontiguous | fwd | 9072 | 7270 | padconstant | 1.24786795048143 |
| PadConstant | float32 | [4 4 1 8 8]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 9248 | 7715 | padconstant | 1.19870382372003 |
| PadConstant | float32 | [4 4 1 8 8]+[0 0 0 0 0 0 0 0 4 4]:42.0 | noncontiguous | fwd | 9232 | 7413 | padconstant | 1.24537973829759 |
| PadConstant | float32 | [4 4 1 8 8]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 9088 | 7466 | padconstant | 1.2172515403161 |
| PadConstant | float32 | [4 4 1 8 8]+[0 0 0 0 0 0 0 0 2 2]:42.0 | noncontiguous | fwd | 9248 | 7217 | padconstant | 1.28141887210752 |
| PadConstant | float32 | [4 4 1 1 32]+[0 0 0 0 0 0 0 0 4 4]:42.0 | noncontiguous | fwd | 8384 | 7182 | padconstant | 1.16736285157338 |
| PadConstant | float32 | [4 4 1 16 16]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 11088 | 7537 | padconstant | 1.47114236433594 |
| PadConstant | float32 | [4 4 1 1 32]+[0 0 0 0 0 0 2 2 2 2]:42.0 | noncontiguous | fwd | 8128 | 7537 | padconstant | 1.07841316173544 |
| PadConstant | float32 | [4 4 1 1 32]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 8160 | 7733 | padconstant | 1.05521789732316 |
| PadConstant | float32 | [4 4 1 1 32]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 8112 | 7253 | padconstant | 1.11843375155108 |
| PadConstant | float32 | [4 4 1 1 64]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 8800 | 7448 | padconstant | 1.18152524167562 |
| PadConstant | float32 | [4 4 1 1 64]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 8752 | 7128 | padconstant | 1.22783389450056 |
| PadConstant | float32 | [4 4 1 1 64]+[0 0 0 0 0 0 0 0 4 4]:42.0 | noncontiguous | fwd | 8800 | 7555 | padconstant | 1.16479152878888 |
| PadConstant | float32 | [4 4 1 1 64]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13456 | 8746 | padconstant_bwd | 1.53853190029728 |
| PadConstant | float32 | [4 4 1 1 64]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13408 | 8231 | padconstant_bwd | 1.62896367391568 |
| PadConstant | float32 | [4 4 1 1 32]+[2 2 2 2 2 2 2 2 0 0]:42.0 | noncontiguous | bwd | 13056 | 8302 | padconstant_bwd | 1.57263310045772 |
| PadConstant | float32 | [4 4 1 1 32]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | bwd | 13328 | 8710 | padconstant_bwd | 1.53019517795637 |
| PadConstant | float32 | [4 4 4 4 4]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13712 | 8977 | padconstant_bwd | 1.52745906204745 |
| PadConstant | float32 | [4 4 1 1 64]+[2 2 2 2 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13168 | 7964 | padconstant_bwd | 1.65344048216976 |
| PadConstant | float32 | [4 4 1 1 32]+[4 4 4 4 4 4 4 4 0 0]:42.0 | noncontiguous | bwd | 13600 | 9137 | padconstant_bwd | 1.48845354054941 |
| PadConstant | float32 | [4 4 1 1 64]+[2 2 2 2 2 2 2 2 0 0]:42.0 | noncontiguous | bwd | 13744 | 8373 | padconstant_bwd | 1.64146661889406 |
| PadConstant | float32 | [4 4 1 8 8]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13248 | 8355 | padconstant_bwd | 1.58563734290844 |
| PadConstant | float32 | [8 8 4 4 4]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | bwd | 14768 | 8906 | padconstant_bwd | 1.65820794969683 |
| PadConstant | float32 | [4 4 4 4 4]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | bwd | 13968 | 9368 | padconstant_bwd | 1.49103330486763 |
| PadConstant | float32 | [4 4 1 16 16]+[4 4 4 4 4 4 4 4 0 0]:42.0 | noncontiguous | bwd | 13552 | 8248 | padconstant_bwd | 1.64306498545102 |
| PadConstant | float32 | [4 4 1 1 64]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13296 | 8355 | padconstant_bwd | 1.59138240574506 |
| PadConstant | float32 | [8 8 4 4 4]+[1 1 1 1 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 14848 | 9066 | padconstant_bwd | 1.63776748290315 |
| PadConstant | float32 | [4 4 1 1 64]+[4 4 4 4 4 4 4 4 0 0]:42.0 | noncontiguous | bwd | 13888 | 8266 | padconstant_bwd | 1.68013549479797 |
| PadConstant | float32 | [4 4 1 8 8]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | bwd | 13440 | 8621 | padconstant_bwd | 1.55898387658044 |
| PadConstant | float32 | [4 4 1 1 128]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 13840 | 8177 | padconstant_bwd | 1.69255228078757 |
| PadConstant | float32 | [8 8 4 4 4]+[2 2 2 2 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 14560 | 8942 | padconstant_bwd | 1.62827108029524 |
| PadConstant | float32 | [4 4 4 4 4]+[4 4 4 4 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 14528 | 9279 | padconstant_bwd | 1.56568595753853 |
| PadConstant | float32 | [256 256 4 4 4]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 321376 | 83785 | padconstant_bwd | 3.83572238467506 |
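
The last column of each table appears to be `rocm_kernel_avg` divided by `mio_kernel_duration`. The sketch below recomputes it from one row as a sanity check. The helper name `speedup_from_row` is hypothetical, and the column positions are taken from the table header above.

```python
def speedup_from_row(row: str) -> float:
    """Return rocm_kernel_avg / mio_kernel_duration for one benchmark table row."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    rocm_time = float(cells[5])  # rocm_kernel_avg column
    mio_time = float(cells[6])   # mio_kernel_duration column
    return rocm_time / mio_time


row = (
    "| PadConstant | float32 | [4 4 4 4 4]+[0 0 0 0 0 0 0 0 2 2]:42.0 "
    "| noncontiguous | fwd | 9792 | 7359 | padconstant | 1.33061557276804 |"
)
print(round(speedup_from_row(row), 4))  # 1.3306
```

A ratio above 1 means the MIOpen kernel took less time than the ROCm kernel for that case.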
float16
| op_name | dtype | size (pytorch format) | cont | direction | rocm_kernel_avg | mio_kernel_duration | miopen_kern | rocm/mio kern |
|---|---|---|---|---|---|---|---|---|
| PadConstant | float16 | [4 4 1 1 32]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 8592 | 7466 | padconstantfp16 | 1.15081703723547 |
| PadConstant | float16 | [4 4 1 1 64]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 8944 | 7039 | padconstantfp16 | 1.27063503338542 |
| PadConstant | float16 | [4 4 1 1 32]+[0 0 0 0 0 0 2 2 2 2]:42.0 | noncontiguous | fwd | 8512 | 6826 | padconstantfp16 | 1.24699677702901 |
| PadConstant | float16 | [4 4 1 16 16]+[0 0 0 0 0 0 2 2 2 2]:42.0 | noncontiguous | fwd | 11472 | 7306 | padconstantfp16 | 1.57021626060772 |
| PadConstant | float16 | [4 4 1 8 8]+[0 0 0 0 0 0 2 2 2 2]:42.0 | noncontiguous | fwd | 8736 | 6523 | padconstantfp16 | 1.33926107619194 |
| PadConstant | float16 | [4 4 1 1 32]+[0 0 0 0 0 0 0 0 4 4]:42.0 | noncontiguous | fwd | 8624 | 6950 | padconstantfp16 | 1.24086330935252 |
| PadConstant | float16 | [4 4 1 1 64]+[0 0 0 0 0 0 0 0 2 2]:42.0 | noncontiguous | fwd | 8928 | 7306 | padconstantfp16 | 1.22200930741856 |
| PadConstant | float16 | [4 4 1 8 8]+[0 0 0 0 0 0 0 0 4 4]:42.0 | noncontiguous | fwd | 8848 | 7324 | padconstantfp16 | 1.20808301474604 |
| PadConstant | float16 | [4 4 1 1 32]+[0 0 0 0 0 0 0 0 2 2]:42.0 | noncontiguous | fwd | 8480 | 7288 | padconstantfp16 | 1.1635565312843 |
| PadConstant | float16 | [4 4 1 8 8]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 8688 | 7217 | padconstantfp16 | 1.20382430372731 |
| PadConstant | float16 | [4 4 1 1 64]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 8976 | 7466 | padconstantfp16 | 1.2022502009108 |
| PadConstant | float16 | [4 4 1 8 8]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 8688 | 7039 | padconstantfp16 | 1.23426623099872 |
| PadConstant | float16 | [4 4 4 4 4]+[0 0 0 0 0 0 0 0 2 2]:42.0 | noncontiguous | fwd | 8496 | 7518 | padconstantfp16 | 1.13008778930567 |
| PadConstant | float16 | [16 16 1 1 4]+[0 0 0 0 0 0 0 0 2 2]:42.0 | noncontiguous | fwd | 8624 | 7483 | padconstantfp16 | 1.15247895229186 |
| PadConstant | float16 | [4 4 1 8 8]+[0 0 0 0 0 0 4 4 4 4]:42.0 | noncontiguous | fwd | 8656 | 6275 | padconstantfp16 | 1.3794422310757 |
| PadConstant | float16 | [32 32 1 1 1]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 8640 | 7964 | padconstantfp16 | 1.08488196885987 |
| PadConstant | float16 | [4 4 1 1 32]+[0 0 0 0 0 0 0 0 1 1]:42.0 | noncontiguous | fwd | 8448 | 7537 | padconstantfp16 | 1.12087037282739 |
| PadConstant | float16 | [16 16 1 1 4]+[0 0 0 0 0 0 0 0 4 4]:42.0 | noncontiguous | fwd | 8640 | 7448 | padconstantfp16 | 1.16004296455424 |
| PadConstant | float16 | [16 16 1 1 4]+[0 0 0 0 0 0 1 1 1 1]:42.0 | noncontiguous | fwd | 8576 | 7164 | padconstantfp16 | 1.19709659408152 |
| PadConstant | float16 | [4 4 1 8 8]+[0 0 0 0 1 1 1 1 1 1]:42.0 | noncontiguous | fwd | 8240 | 6399 | padconstantfp16 | 1.28770120331302 |
| PadConstant | float16 | [4 4 1 1 32]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13184 | 8070 | padconstantfp16_bwd | 1.63370508054523 |
| PadConstant | float16 | [4 4 1 16 16]+[2 2 2 2 2 2 0 0 0 0]:42.0 | noncontiguous | bwd | 13968 | 8159 | padconstantfp16_bwd | 1.71197450667974 |
| PadConstant | float16 | [4 4 1 1 64]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13712 | 7964 | padconstantfp16_bwd | 1.72174786539427 |
| PadConstant | float16 | [4 4 1 1 32]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13264 | 7893 | padconstantfp16_bwd | 1.68047637146839 |
| PadConstant | float16 | [4 4 1 1 32]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13136 | 8302 | padconstantfp16_bwd | 1.58226933269092 |
| PadConstant | float16 | [4 4 1 1 128]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 14272 | 7786 | padconstantfp16_bwd | 1.83303365014128 |
| PadConstant | float16 | [4 4 1 1 128]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 14048 | 8337 | padconstantfp16_bwd | 1.6850185918196 |
| PadConstant | float16 | [4 4 1 1 32]+[2 2 2 2 2 2 2 2 0 0]:42.0 | noncontiguous | bwd | 13872 | 8426 | padconstantfp16_bwd | 1.64633277949205 |
| PadConstant | float16 | [4 4 1 1 64]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13408 | 8035 | padconstantfp16_bwd | 1.66869943995022 |
| PadConstant | float16 | [4 4 1 1 128]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13904 | 8284 | padconstantfp16_bwd | 1.67841622404635 |
| PadConstant | float16 | [4 4 1 1 32]+[1 1 1 1 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13616 | 8568 | padconstantfp16_bwd | 1.58916900093371 |
| PadConstant | float16 | [4 4 1 1 128]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 14256 | 8266 | padconstantfp16_bwd | 1.72465521413017 |
| PadConstant | float16 | [4 4 1 8 8]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 12864 | 8337 | padconstantfp16_bwd | 1.54300107952501 |
| PadConstant | float16 | [4 4 1 1 64]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 13648 | 8124 | padconstantfp16_bwd | 1.67996061053668 |
| PadConstant | float16 | [4 4 1 8 8]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13152 | 8195 | padconstantfp16_bwd | 1.60488102501525 |
| PadConstant | float16 | [4 4 1 1 64]+[4 4 4 4 4 4 4 4 0 0]:42.0 | noncontiguous | bwd | 13776 | 8124 | padconstantfp16_bwd | 1.69571639586411 |
| PadConstant | float16 | [16 16 1 1 4]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 13440 | 8177 | padconstantfp16_bwd | 1.64363458481106 |
| PadConstant | float16 | [4 4 1 1 32]+[4 4 4 4 4 4 4 4 0 0]:42.0 | noncontiguous | bwd | 14016 | 8782 | padconstantfp16_bwd | 1.59599180141198 |
| PadConstant | float16 | [4 4 4 16 16]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 14880 | 9101 | padconstantfp16_bwd | 1.63498516646522 |
| PadConstant | float16 | [4 4 4 8 8]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 14704 | 8959 | padconstantfp16_bwd | 1.64125460430852 |
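
The size column packs three values into one string. Reading it as `[input shape]+[padding amounts]:fill value` is an inference from the rows, not something this description states, so treat the sketch below as an assumption. The helper `parse_case` is hypothetical. It does not map padding pairs to dimensions, because that ordering is not spelled out here.

```python
import re
from typing import List, Tuple


def parse_case(spec: str) -> Tuple[List[int], List[int], float]:
    """Split '[shape]+[padding]:value' into shape, padding list, and fill value."""
    m = re.fullmatch(r"\[([\d ]+)\]\+\[([\d ]+)\]:([\d.]+)", spec.strip())
    if m is None:
        raise ValueError(f"unrecognized case spec: {spec!r}")
    shape = [int(t) for t in m.group(1).split()]
    padding = [int(t) for t in m.group(2).split()]
    return shape, padding, float(m.group(3))


shape, padding, value = parse_case("[4 4 1 1 32]+[0 0 0 0 0 0 1 1 1 1]:42.0")
print(shape)    # [4, 4, 1, 1, 32]
print(padding)  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
print(value)    # 42.0
# Each benchmark case lists two padding amounts per tensor dimension.
print(len(padding) == 2 * len(shape))  # True
```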
bfloat16
| op_name | dtype | size (pytorch format) | cont | direction | rocm_kernel_avg | mio_kernel_duration | miopen_kern | rocm/mio kern |
|---|---|---|---|---|---|---|---|---|
| PadConstant | bfloat16 | [8 8 4 4 4]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | fwd | 12849 | 7431 | padconstantbfp16 | 1.72910779168349 |
| PadConstant | bfloat16 | [4 4 1 256 256]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 42338 | 24658 | padconstantbfp16 | 1.71700867872496 |
| PadConstant | bfloat16 | [4 4 1 256 256]+[1 1 1 1 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 42210 | 24871 | padconstantbfp16 | 1.69715733183225 |
| PadConstant | bfloat16 | [4 4 4 4 4]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | fwd | 9921 | 5867 | padconstantbfp16 | 1.69098346684847 |
| PadConstant | bfloat16 | [4 4 1 256 256]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 42258 | 25191 | padconstantbfp16 | 1.67750387042992 |
| PadConstant | bfloat16 | [4 4 1 256 256]+[4 4 4 4 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 42962 | 25618 | padconstantbfp16 | 1.67702396752284 |
| PadConstant | bfloat16 | [4 4 4 16 16]+[1 1 1 1 1 1 0 0 0 0]:42.0 | noncontiguous | fwd | 12401 | 7431 | padconstantbfp16 | 1.66881980890863 |
| PadConstant | bfloat16 | [16 16 4 4 4]+[1 1 1 1 1 1 0 0 0 0]:42.0 | noncontiguous | fwd | 12449 | 7467 | padconstantbfp16 | 1.66720235703763 |
| PadConstant | bfloat16 | [64 64 1 1 4]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 12129 | 7324 | padconstantbfp16 | 1.65606226105953 |
| PadConstant | bfloat16 | [4 4 4 8 8]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | fwd | 12033 | 7289 | padconstantbfp16 | 1.65084373713815 |
| PadConstant | bfloat16 | [4 4 4 128 128]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 40771 | 24712 | padconstantbfp16 | 1.64984622855293 |
| PadConstant | bfloat16 | [4 4 4 16 16]+[4 4 4 4 4 4 0 0 0 0]:42.0 | noncontiguous | fwd | 12225 | 7413 | padconstantbfp16 | 1.64912990692028 |
| PadConstant | bfloat16 | [64 64 1 1 4]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 12001 | 7306 | padconstantbfp16 | 1.64262250205311 |
| PadConstant | bfloat16 | [4 4 4 256 256]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 136777 | 83310 | padconstantbfp16 | 1.64178369943584 |
| PadConstant | bfloat16 | [4 4 4 16 16]+[2 2 2 2 2 2 0 0 0 0]:42.0 | noncontiguous | fwd | 12113 | 7431 | padconstantbfp16 | 1.63006324855336 |
| PadConstant | bfloat16 | [8 8 4 4 4]+[1 1 1 1 1 1 0 0 0 0]:42.0 | noncontiguous | fwd | 12304 | 7573 | padconstantbfp16 | 1.62471939786082 |
| PadConstant | bfloat16 | [4 4 4 128 128]+[1 1 1 1 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 40626 | 25032 | padconstantbfp16 | 1.62296260786194 |
| PadConstant | bfloat16 | [4 4 4 128 128]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 40755 | 25121 | padconstantbfp16 | 1.62234783647148 |
| PadConstant | bfloat16 | [4 4 4 256 256]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 136857 | 84447 | padconstantbfp16 | 1.62062595474084 |
| PadConstant | bfloat16 | [4 4 4 256 256]+[2 2 2 2 0 0 0 0 0 0]:42.0 | noncontiguous | fwd | 136329 | 84252 | padconstantbfp16 | 1.61810995584675 |
| PadConstant | bfloat16 | [4 4 4 32 32]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 17249 | 8942 | padconstantbfp16_bwd | 1.92898680384701 |
| PadConstant | bfloat16 | [256 256 4 4 4]+[2 2 2 2 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 160906 | 83649 | padconstantbfp16_bwd | 1.92358545828402 |
| PadConstant | bfloat16 | [32 32 4 4 4]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 17170 | 8942 | padconstantbfp16_bwd | 1.92015209125475 |
| PadConstant | bfloat16 | [4 4 4 32 32]+[4 4 4 4 4 4 4 4 0 0]:42.0 | noncontiguous | bwd | 16929 | 8818 | padconstantbfp16_bwd | 1.91982308913586 |
| PadConstant | bfloat16 | [16 16 4 4 4]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 16481 | 8587 | padconstantbfp16_bwd | 1.9192966111564 |
| PadConstant | bfloat16 | [256 256 1 4 4]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 48659 | 25423 | padconstantbfp16_bwd | 1.91397553396531 |
| PadConstant | bfloat16 | [256 256 4 4 4]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 159642 | 83435 | padconstantbfp16_bwd | 1.91336968897945 |
| PadConstant | bfloat16 | [256 256 1 4 4]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 48275 | 25280 | padconstantbfp16_bwd | 1.90961234177215 |
| PadConstant | bfloat16 | [256 256 4 4 4]+[2 2 2 2 2 2 0 0 0 0]:42.0 | noncontiguous | bwd | 159771 | 83684 | padconstantbfp16_bwd | 1.90921801061135 |
| PadConstant | bfloat16 | [128 128 4 4 4]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 49236 | 25956 | padconstantbfp16_bwd | 1.89690245030051 |
| PadConstant | bfloat16 | [8 8 4 4 4]+[4 4 4 4 4 4 4 4 0 0]:42.0 | noncontiguous | bwd | 16673 | 8800 | padconstantbfp16_bwd | 1.89465909090909 |
| PadConstant | bfloat16 | [16 16 1 1 4]+[2 2 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 15505 | 8213 | padconstantbfp16_bwd | 1.88786070863266 |
| PadConstant | bfloat16 | [4 4 1 1 128]+[1 1 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 15393 | 8160 | padconstantbfp16_bwd | 1.88639705882353 |
| PadConstant | bfloat16 | [128 128 4 4 4]+[2 2 2 2 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 50196 | 26632 | padconstantbfp16_bwd | 1.88480024031241 |
| PadConstant | bfloat16 | [4 4 4 32 32]+[2 2 2 2 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 16529 | 8800 | padconstantbfp16_bwd | 1.87829545454545 |
| PadConstant | bfloat16 | [128 128 4 4 4]+[2 2 2 2 2 2 0 0 0 0]:42.0 | noncontiguous | bwd | 49443 | 26437 | padconstantbfp16_bwd | 1.87021976774974 |
| PadConstant | bfloat16 | [256 256 4 4 4]+[4 4 0 0 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 155722 | 83275 | padconstantbfp16_bwd | 1.86997298108676 |
| PadConstant | bfloat16 | [4 4 4 128 128]+[1 1 1 1 1 1 1 1 0 0]:42.0 | noncontiguous | bwd | 46403 | 24818 | padconstantbfp16_bwd | 1.86973164638569 |
| PadConstant | bfloat16 | [4 4 4 128 128]+[2 2 2 2 2 2 2 2 0 0]:42.0 | noncontiguous | bwd | 46227 | 24747 | padconstantbfp16_bwd | 1.86798399806037 |
| PadConstant | bfloat16 | [16 16 4 4 4]+[4 4 4 4 0 0 0 0 0 0]:42.0 | noncontiguous | bwd | 16129 | 8640 | padconstantbfp16_bwd | 1.86678240740741 |
TODO: Register the padconstant solver in the registry in src/solver.cpp.
MIOpen is moving to the new monorepo setup and all older unmerged PRs are being closed. Please re-open this as part of the new repo if these changes are still needed.