[Flow][SDXL] Numerics different with vs. without aggressive fusion on SDXL
Running SDXL int8 compiled with aggressive fusion enabled produces different numerical results from the same model compiled without it.
Repro Instructions
- Check out https://github.com/iree-org/iree/tree/shared/sdxl_quantized in IREE
- Clone https://github.com/nod-ai/sdxl-scripts and
cd sdxl-scripts/int8-model
- Run
./compile-punet.sh gfx942
and
iree-run-module --device=hip://0 --hip_use_streams=true --hip_allow_inline_execution=true --device_allocator=caching --module=tmp/punet.vmfb --parameters=model=/data/shark/sdxl_unet_int8_dataset.irpa --function=main --input=1x4x128x128xf16=1.0 --input=1xsi32=1 --input=2x64x2048xf16=1.0 --input=2x1280xf16=1.0 --input=2x6xf16=1.0 --input=1xf16=1.0 --output=@out_default.npy
- Remove --iree-flow-enable-aggressive-fusion from compile-punet-base.sh
- Run
./compile-punet.sh gfx942
and
iree-run-module --device=hip://0 --hip_use_streams=true --hip_allow_inline_execution=true --device_allocator=caching --module=tmp/punet.vmfb --parameters=model=/data/shark/sdxl_unet_int8_dataset.irpa --function=main --input=1x4x128x128xf16=1.0 --input=1xsi32=1 --input=2x64x2048xf16=1.0 --input=2x1280xf16=1.0 --input=2x6xf16=1.0 --input=1xf16=1.0 --output=@out_no_aggressive_fusion.npy
- Compare the results in out_default.npy vs. out_no_aggressive_fusion.npy:
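import numpy as np
# Load the outputs of the two runs (with and without aggressive fusion).
a = np.load("out_default.npy")
b = np.load("out_no_aggressive_fusion.npy")
diff = a - b
print(diff)
# Largest signed element of (a - b); use np.max(np.abs(diff)) for the magnitude.
print(np.max(diff))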
Max diff between output tensors is 0.2993
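The comparison above prints the largest signed difference, so a large negative deviation would go unnoticed. Below is a minimal sketch of a sign-independent summary. The helper name `summarize_diff` and the `atol=1e-2` threshold are hypothetical choices for illustration and are not part of IREE or sdxl-scripts.

```python
import numpy as np


def summarize_diff(a, b, atol=1e-2):
    """Return sign-independent difference statistics for two same-shaped arrays.

    `atol` is an illustrative threshold (an assumption), not a project tolerance.
    """
    # Upcast so f16 outputs do not lose precision in the subtraction.
    a64 = np.asarray(a, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    if a64.shape != b64.shape:
        raise ValueError(f"shape mismatch: {a64.shape} vs {b64.shape}")
    abs_diff = np.abs(a64 - b64)
    # Location of the largest deviation, as a multi-dimensional index.
    worst = np.unravel_index(np.argmax(abs_diff), abs_diff.shape)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "worst_index": tuple(int(i) for i in worst),
        "num_over_atol": int((abs_diff > atol).sum()),
    }
```

It could be called as `summarize_diff(np.load("out_default.npy"), np.load("out_no_aggressive_fusion.npy"))`. Reporting the index of the worst element and the count of elements over the threshold may help tell a single outlier apart from a widespread drift.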