oneDNN
[nvidia] int8 convolution primitive fails correctness check
Summary
oneDNN validation for the Nvidia backend hits a correctness issue under benchdnn on int8 convolution problems when a dst scale is set.
Steps to reproduce
Build
mkdir -p build
cd build
# CMAKE_BUILD_TYPE may also be debug; DNNL_CPU_RUNTIME may also be NONE.
cmake .. -DCMAKE_BUILD_TYPE=release -DDNNL_CPU_RUNTIME=DPCPP -DDNNL_GPU_RUNTIME=DPCPP -DDNNL_GPU_VENDOR=NVIDIA -DONEDNN_BUILD_GRAPH=OFF
cmake --build . --target benchdnn
Run
benchdnn --conv --engine=gpu --dir=FWD_I --dt=s8:s8:s8 --attr-scales=dst:common:2 g4ic20ih5oc20oh5kh3ph1n"2d_tail_conv:grouped"
Observed behavior
Failures are reproducible within a single run.
run: --mode-modifier=P --conv --engine=gpu --dir=FWD_I --dt=s8:s8:s8 --attr-scales=dst:common:2 g4ic20ih5oc20oh5kh3ph1n"2d_tail_conv:grouped"
[ 91][DST][0:3:3:1] exp_f32: -74 exp: -74 got: -64 diff: 10 rdiff:0.135135
[ 289][DST][0:11:2:4] exp_f32: -68 exp: -68 got: -64 diff: 4 rdiff:0.0588235
[ 736][DST][1:9:2:1] exp_f32: 79 exp: 79 got: 64 diff: 15 rdiff:0.189873
[ 813][DST][1:12:2:3] exp_f32: -71 exp: -71 got: -64 diff: 7 rdiff:0.0985916
[ 858][DST][1:14:1:3] exp_f32: -68 exp: -68 got: -64 diff: 4 rdiff:0.0588235
[COMPARE_STATS][DST]: trh=0 max_diff: 15 max_rdiff:0.189873
2207:FAILED (errors:5 total:1000) __REPRO: --mode-modifier=P --conv --engine=gpu --dir=FWD_I --dt=s8:s8:s8 --attr-scales=dst:common:2 g4ic20ih5oc20oh5kh3ph1n"2d_tail_conv:grouped"
The failures are caused by incorrect dst scale handling. Instead of applying the scale to the f32 output of the accumulation, the implementation first down-converts the output to s8. That conversion saturates the value, and the scale is applied only afterwards, which produces a mismatched result.
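The ordering problem can be illustrated with a small arithmetic sketch. This is not the backend code. The helper names and the accumulator values (-148 and 158) are hypothetical, chosen only because they reproduce the `exp` and `got` pairs in the log above, and the sketch assumes the dst scale is applied as a division, which is what those numbers suggest.

```python
def saturate_s8(x):
    """Clamp a value to the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, x))


def scale_then_convert(acc, dst_scale):
    """Assumed expected order: scale the f32 accumulator, then round and saturate."""
    return int(saturate_s8(round(acc / dst_scale)))


def convert_then_scale(acc, dst_scale):
    """Order described in this report: saturate to s8 first, then apply the scale."""
    return int(round(saturate_s8(acc) / dst_scale))


# Hypothetical accumulator values that fall outside the s8 range.
for acc in (-148.0, 158.0):
    print(acc, scale_then_convert(acc, 2.0), convert_then_scale(acc, 2.0))
# -148.0 -> expected -74, saturate-first gives -64
#  158.0 -> expected  79, saturate-first gives  64
```

Under these assumptions, any accumulator value outside [-128, 127] collapses to roughly ±64 when saturation happens first, which matches every `got` value in the failing points.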