
FP8 support in stable-fast

Open jkrauss82 opened this issue 1 year ago • 6 comments

Is it planned?

Currently getting this error when trying to run ComfyUI in FP8 (flags --fp8_e4m3fn-text-enc and --fp8_e4m3fn-unet):

RuntimeError: "addmm_cuda" not implemented for 'Float8_e4m3fn'

jkrauss82 avatar Feb 25 '24 15:02 jkrauss82

I'm quite sure stable-fast has its own quantization support, but IIRC it isn't implemented in the ComfyUI node.

banjaminicc avatar Feb 25 '24 16:02 banjaminicc

@jkrauss82 Sorry, FP8 kernels aren't implemented, and I'm afraid I lack the time to support them right now.

chengzeyi avatar Feb 26 '24 09:02 chengzeyi

Thanks for the reply, understood. It would be nice if it could be supported eventually.

jkrauss82 avatar Feb 27 '24 21:02 jkrauss82

@jkrauss82 I have created a new project that supports FP8 inference with diffusers. However, it has not been open-sourced yet. I hope it can be made public soon...

chengzeyi avatar May 09 '24 14:05 chengzeyi

A new project supporting FP8 inference, intended to succeed stable-fast, could be published soon. I hope everyone will enjoy it.

chengzeyi avatar May 09 '24 15:05 chengzeyi

That would be very welcome. I have seen FP8 support gaining traction recently in the vLLM project; it would be nice to have it in diffusers/image generation as well. I will stay tuned. Thanks for the update!

jkrauss82 avatar May 09 '24 21:05 jkrauss82