quanto
Errors when applied to Lumina-Next
I get an AssertionError when I run the following code:
import torch
from diffusers import LuminaText2ImgPipeline
from optimum.quanto import freeze, qfloat8, quantize
pipeline = LuminaText2ImgPipeline.from_pretrained("Alpha-VLLM/Lumina-Next-SFT-diffusers", torch_dtype=torch.float16).to("cuda")
quantize(pipeline.transformer, weights=qfloat8)
freeze(pipeline.transformer)
image = pipeline("ghibli style, a fantasy landscape with castles").images[0]
qbytes_mm requires activations with 2 or 3 dimensions, but during LuminaText2ImgPipeline inference it receives an activation with ndim=4.
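To illustrate the shape constraint, here is a minimal sketch of the usual way around it: collapse the leading dimensions of a 4D activation into one, run the matmul on the 2D view, and restore the original shape afterwards. This is not quanto's code or a confirmed fix. The helper name `linear_with_flattened_batch` is hypothetical, the tensor sizes are made up, and plain float tensors with `torch.nn.functional.linear` stand in for the quantized weights and qbytes_mm.

```python
import torch
import torch.nn.functional as F


def linear_with_flattened_batch(x, weight, bias=None):
    """Hypothetical helper: apply a linear layer to an input of any rank
    by collapsing all leading dimensions into a single batch dimension."""
    leading_shape = x.shape[:-1]          # e.g. (2, 4, 8) for a 4D input
    x2d = x.reshape(-1, x.shape[-1])      # (2*4*8, in_features), ndim == 2
    # A kernel that only accepts 2D or 3D activations would be called here;
    # F.linear is used as a stand-in (assumption).
    out2d = F.linear(x2d, weight, bias)
    # Restore the leading dimensions; the last one becomes out_features.
    return out2d.reshape(*leading_shape, weight.shape[0])


# Made-up sizes: a 4D activation like the one that triggers the assertion.
x = torch.randn(2, 4, 8, 16)
weight = torch.randn(32, 16)              # (out_features, in_features)
out = linear_with_flattened_batch(x, weight)
print(out.shape)
```

Because a linear layer only acts on the last dimension, the flattened call should give the same result as applying it to the 4D tensor directly, up to floating-point rounding.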
The primary logs are as follows:
The AssertionError is raised at this line of code: