
Support for new diffuser: flux1.schnell

Open KoppAlexander opened this issue 6 months ago • 5 comments

@sayakpaul

Hello,

I am looking for support for saving and loading the FLUX.1-schnell model from Black Forest Labs.

Following your code from the "Bonus" section here:

Saving

from diffusers import PixArtTransformer2DModel
from optimum.quanto import QuantizedPixArtTransformer2DModel, qfloat8

model = PixArtTransformer2DModel.from_pretrained("PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", subfolder="transformer")
qmodel = QuantizedPixArtTransformer2DModel.quantize(model, weights=qfloat8)
qmodel.save_pretrained("pixart-sigma-fp8")

Loading

from optimum.quanto import QuantizedPixArtTransformer2DModel
import torch

transformer = QuantizedPixArtTransformer2DModel.from_pretrained("pixart-sigma-fp8") 
transformer.to(device="cuda", dtype=torch.float16)

I am looking for a similar option to save and load the two quantized models in this repo here,

see lines 38, 45, and 46:

transformer = FluxTransformer2DModel.from_pretrained(bfl_repo, subfolder="transformer", torch_dtype=dtype, revision=revision)
quantize(transformer, weights=qfloat8)
freeze(transformer)

and lines 35, 48, and 49:

text_encoder_2 = T5EncoderModel.from_pretrained(bfl_repo, subfolder="text_encoder_2", torch_dtype=dtype, revision=revision)
quantize(text_encoder_2, weights=qfloat8)
freeze(text_encoder_2)

If I'm understanding this correctly, there would need to be something like

from optimum.quanto import QuantizedFluxTransformer2DModel for the transformer, and a corresponding quantized wrapper (e.g. QuantizedT5EncoderModel) for the text_encoder_2, to be able to save and load the quantized transformer and encoder. Is that correct? Or is there another possibility which avoids importing the quantized version of a model from optimum.quanto?

Thank you!


KoppAlexander, Aug 07 '24 14:08