iree [Flow][SDXL] Numerics different with vs. without aggressive fusion on SDXL

Open Max191 opened this issue 6 months ago • 7 comments

Running SDXL int8 with aggressive fusion enabled produces different numerical results than running without it.

Repro Instructions

Check out https://github.com/iree-org/iree/tree/shared/sdxl_quantized in IREE.
Clone https://github.com/nod-ai/sdxl-scripts and cd into sdxl-scripts/int8-model.
Run ./compile-punet.sh gfx942 and then:

iree-run-module   --device=hip://0   --hip_use_streams=true   --hip_allow_inline_execution=true   --device_allocator=caching   --module=tmp/punet.vmfb   --parameters=model=/data/shark/sdxl_unet_int8_dataset.irpa   --function=main   --input=1x4x128x128xf16=1.0   --input=1xsi32=1   --input=2x64x2048xf16=1.0   --input=2x1280xf16=1.0   --input=2x6xf16=1.0   --input=1xf16=1.0 --output=@out_default.npy

Remove --iree-flow-enable-aggressive-fusion from compile-punet-base.sh.
Run ./compile-punet.sh gfx942 again and then:

iree-run-module   --device=hip://0   --hip_use_streams=true   --hip_allow_inline_execution=true   --device_allocator=caching   --module=tmp/punet.vmfb   --parameters=model=/data/shark/sdxl_unet_int8_dataset.irpa   --function=main   --input=1x4x128x128xf16=1.0   --input=1xsi32=1   --input=2x64x2048xf16=1.0   --input=2x1280xf16=1.0   --input=2x6xf16=1.0   --input=1xf16=1.0 --output=@out_no_aggressive_fusion.npy

Compare the results in out_default.npy vs. out_no_aggressive_fusion.npy:

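import numpy as np

# Load the outputs of the two runs.
a = np.load("out_default.npy")
b = np.load("out_no_aggressive_fusion.npy")

# Elementwise difference between the two outputs.
diff = a - b

print(diff)
# Largest signed difference.
print(np.max(diff))
# Largest absolute difference; np.max(diff) alone misses large negative deviations.
print(np.max(np.abs(diff)))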

Max diff between output tensors is 0.2993
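
For a more detailed comparison of the two outputs, something like the following may help. It is only a minimal sketch, assuming both .npy files hold arrays of the same shape. The helper name `summarize_diff` and the tolerance defaults are illustrative and not part of the original repro. It reports the largest and mean absolute difference, the index of the worst element, and whether the arrays match within tolerance.

```python
import numpy as np


def summarize_diff(a, b, rtol=1e-3, atol=1e-3):
    """Return basic statistics on the elementwise difference between a and b.

    The name and the default tolerances are hypothetical choices for this sketch.
    """
    # Promote to float32 so f16 outputs are compared at higher precision.
    a32 = np.asarray(a, dtype=np.float32)
    b32 = np.asarray(b, dtype=np.float32)
    abs_diff = np.abs(a32 - b32)
    # Flat index of the worst element, converted back to an N-d index.
    worst = np.unravel_index(np.argmax(abs_diff), abs_diff.shape)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "worst_index": tuple(int(i) for i in worst),
        "allclose": bool(np.allclose(a32, b32, rtol=rtol, atol=atol)),
    }


if __name__ == "__main__":
    # Synthetic stand-ins for the two outputs; replace with np.load(...) calls
    # on out_default.npy and out_no_aggressive_fusion.npy for the real check.
    a = np.ones((2, 3), dtype=np.float16)
    b = a.copy()
    b[1, 2] = 0.5
    print(summarize_diff(a, b))
```

Knowing the index of the worst element can help narrow down which part of the output tensor diverges, rather than relying on a single scalar.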

Aug 08 '24 16:08 Max191
