[nvidia] int8 convolution with s8 dst primitive and a sum post op fails correctness check
Summary
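On the NVIDIA backend, benchdnn reports correctness failures for int8 convolution problems that have an s8 destination and a sum post-op. In every mismatch printed below, the f32 reference value lies outside the s8 range and is saturated to 127 or -128, while the GPU result falls 1 to 4 short of that limit.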
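For context, the sum post-op adds the (scaled) previous destination value to the convolution result before the result is converted to the destination data type, and int8 destinations saturate on conversion. The following is a minimal Python sketch of that reference arithmetic, not oneDNN or benchdnn code: the helper names `saturate_s8` and `conv_sum_s8` are hypothetical, and the split of 129 into 100 + 29 is an illustrative assumption, since the log reports only the combined f32 reference value.

```python
def saturate_s8(x: float) -> int:
    """Round to the nearest integer, then clamp to the s8 range [-128, 127]."""
    return max(-128, min(127, round(x)))


def conv_sum_s8(conv_acc: float, dst_prev: int, sum_scale: float = 1.0) -> int:
    """Sketch of a sum post-op followed by conversion to s8.

    conv_acc  -- convolution accumulator for one output point (hypothetical input)
    dst_prev  -- value already stored in dst before the primitive runs
    sum_scale -- scale of the sum post-op (1 in --attr-post-ops=sum:1)
    """
    return saturate_s8(conv_acc + sum_scale * dst_prev)


# Values taken from the log: exp_f32 129 -> exp 127, exp_f32 -136 -> exp -128.
print(saturate_s8(129))
print(saturate_s8(-136))

# Assumed decomposition of 129 into accumulator + previous dst, for illustration only.
print(conv_sum_s8(100, 29))
```

Under this model, any point whose f32 reference falls outside [-128, 127] should land exactly on 127 or -128, which is what the `exp` column in the log shows.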
Build
mkdir -p build
cd build
# CMAKE_BUILD_TYPE can also be debug; DNNL_CPU_RUNTIME can also be NONE
cmake .. -DCMAKE_BUILD_TYPE=release -DDNNL_CPU_RUNTIME=DPCPP -DDNNL_GPU_RUNTIME=DPCPP -DDNNL_GPU_VENDOR=NVIDIA -DONEDNN_BUILD_GRAPH=OFF
cmake --build . --target benchdnn
Run
$ ./tests/benchdnn/benchdnn --conv --engine=gpu --skip-impl=ref --dir=FWD_D --dt=s8:s8:s8 --attr-post-ops=sum:1 g1ic32ih149oc32oh147kh3ph0n"googlenet_v3:conv_1_1_conv2d"
Observed behavior
$ ./tests/benchdnn/benchdnn --conv --engine=gpu --skip-impl=ref --dir=FWD_D --dt=s8:s8:s8 --attr-post-ops=sum:1 g1ic32ih149oc32oh147kh3ph0n"googlenet_v3:conv_1_1_conv2d"
[ 140][DST][0:0:0:140] exp_f32: 129 exp: 127 got: 124 diff: 3 rdiff:0.023622
[ 186][DST][0:0:1:39] exp_f32: -136 exp: -128 got: -124 diff: 4 rdiff: 0.03125
[ 363][DST][0:0:2:69] exp_f32: -131 exp: -128 got: -127 diff: 1 rdiff:0.0078125
[ 574][DST][0:0:3:133] exp_f32: -156 exp: -128 got: -126 diff: 2 rdiff:0.015625
[ 607][DST][0:0:4:19] exp_f32: -148 exp: -128 got: -126 diff: 2 rdiff:0.015625
[ 894][DST][0:0:6:12] exp_f32: -129 exp: -128 got: -125 diff: 3 rdiff:0.0234375
[1188][DST][0:0:8:12] exp_f32: 151 exp: 127 got: 124 diff: 3 rdiff:0.023622
[1551][DST][0:0:10:81] exp_f32: 144 exp: 127 got: 125 diff: 2 rdiff:0.015748
[1680][DST][0:0:11:63] exp_f32: -158 exp: -128 got: -124 diff: 4 rdiff: 0.03125
[1716][DST][0:0:11:99] exp_f32: -135 exp: -128 got: -127 diff: 1 rdiff:0.0078125
[COMPARE_STATS][DST]: trh=0 err_max_diff: 4 err_max_rdiff:0.0314961 all_max_diff: 4 all_max_rdiff:0.0314961
0:FAILED (errors:7570 total:1382976) __REPRO: --conv --engine=gpu --skip-impl=ref --dir=FWD_D --dt=s8:s8:s8 --attr-post-ops=sum g1ic32ih149oc32oh147kh3ph0ngooglenet_v3:conv_1_1_conv2d
tests:1 passed:0 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:1 listed:0
total: 5.35s; fill: 0.05s (1%); compute_ref: 0.04s (1%); compare: 0.06s (1%);
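The log above prints only the first few of the 7570 failing points, so it may help to check a full log for the same pattern. The following is a minimal Python sketch for that, assuming the mismatch lines keep the `exp_f32: ... exp: ... got: ...` layout shown above with integer `exp` and `got` values; the function names and the regular expression are illustrative and not part of benchdnn.

```python
import re

# Matches lines such as:
# [ 140][DST][0:0:0:140] exp_f32: 129 exp: 127 got: 124 diff: 3 rdiff:0.023622
LINE_RE = re.compile(
    r"\[\s*\d+\]\[DST\]\[[\d:]+\]\s*"
    r"exp_f32:\s*(-?\d+(?:\.\d+)?)\s+exp:\s*(-?\d+)\s+got:\s*(-?\d+)"
)


def parse_mismatches(log: str):
    """Return (exp_f32, exp, got) tuples for each mismatch line found in the log."""
    return [(float(f), int(e), int(g)) for f, e, g in LINE_RE.findall(log)]


def all_expected_saturated(rows, lo=-128, hi=127):
    """True if every expected value sits exactly on an s8 limit."""
    return all(exp in (lo, hi) for _, exp, _ in rows)


# Three lines copied from the log above.
sample = """\
[ 140][DST][0:0:0:140] exp_f32: 129 exp: 127 got: 124 diff: 3 rdiff:0.023622
[ 186][DST][0:0:1:39] exp_f32: -136 exp: -128 got: -124 diff: 4 rdiff: 0.03125
[ 363][DST][0:0:2:69] exp_f32: -131 exp: -128 got: -127 diff: 1 rdiff:0.0078125
"""

rows = parse_mismatches(sample)
print(rows)
print(all_expected_saturated(rows))  # True for these three lines
```

If this returns True over a complete log, the failures would be confined to points where the reference saturates, which would narrow the search to how the implementation clamps or orders the sum and the s8 conversion.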