quanto Errors when applied to Lumina-Next

Open phil329 opened this issue 6 months ago • 1 comment

I get an AssertionError when I run the following code.

import torch
from diffusers import LuminaText2ImgPipeline
from optimum.quanto import freeze, qfloat8, quantize

pipeline = LuminaText2ImgPipeline.from_pretrained(
    "Alpha-VLLM/Lumina-Next-SFT-diffusers", torch_dtype=torch.float16
).to("cuda")

# Quantize the transformer weights to float8, then freeze them.
quantize(pipeline.transformer, weights=qfloat8)
freeze(pipeline.transformer)

# The AssertionError is raised during this call.
image = pipeline("ghibli style, a fantasy landscape with castles").images[0]

qbytes_mm requires activations with 2 or 3 dimensions (ndim of 2 or 3). However, the activations have ndim=4 during inference with LuminaText2ImgPipeline.
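One possible workaround, sketched here only to illustrate the shape constraint, is to collapse the leading dimensions of the activations before the matrix multiplication and restore them afterwards. The sketch assumes the failing layer is a linear layer acting on the last dimension. `apply_on_flattened` is a hypothetical helper, not part of quanto or diffusers, and a plain `nn.Linear` stands in for the quantized layer. Whether this can be wired into the Lumina transformer has not been verified.

```python
import torch
from torch import nn


def apply_on_flattened(layer: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Run `layer` on a 2-D view of `x`, then restore the leading dimensions."""
    leading = x.shape[:-1]                 # e.g. three leading dims for a 4-D input
    flat = x.reshape(-1, x.shape[-1])      # 2-D: (prod(leading), in_features)
    out = layer(flat)                      # the layer only ever sees ndim == 2
    return out.reshape(*leading, out.shape[-1])


# Stand-in for a quantized linear layer (assumption: a plain nn.Linear for illustration).
layer = nn.Linear(8, 4)
x = torch.randn(2, 3, 5, 8)                # 4-D activations, as in the failing case
y = apply_on_flattened(layer, x)
print(y.shape)                             # torch.Size([2, 3, 5, 4])
```

Because a linear layer treats every leading dimension as a batch dimension, the flattened call should give the same values as calling the layer on the 4-D tensor directly, up to floating-point rounding.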

The primary logs are as follows:

The AssertionError is raised at the following line of code:

Aug 04 '24 07:08 phil329
