diffusers
How to quantize Wan 2.2 VACE after loading a LoRA?
```python
import torch
import diffusers

pipe = diffusers.WanVACEPipeline.from_pretrained(
    'linoyts/Wan2.2-VACE-Fun-14B-diffusers',
    vae=diffusers.AutoencoderKLWan.from_pretrained(
        'linoyts/Wan2.2-VACE-Fun-14B-diffusers', subfolder='vae', torch_dtype=torch.float32
    ),
    torch_dtype=torch.bfloat16,
    quantization_config=diffusers.PipelineQuantizationConfig(
        quant_backend='bitsandbytes_8bit',
        quant_kwargs={'load_in_8bit': True},
        components_to_quantize=['transformer', 'transformer_2'],
    ),
)
pipe.save_pretrained('wan')
```
Normally I can save the quantized model this way. But now I want to merge a LoRA into the model, quantize it, and then save the model with the LoRA merged in. How?
```python
wan = diffusers.WanVACEPipeline.from_pretrained(
    'linoyts/Wan2.2-VACE-Fun-14B-diffusers',
    vae=diffusers.AutoencoderKLWan.from_pretrained(
        'linoyts/Wan2.2-VACE-Fun-14B-diffusers', subfolder='vae', torch_dtype=torch.float32
    ),
    torch_dtype=torch.bfloat16,
)
wan.load_lora_weights(
    'lightx2v/Wan2.2-Lightning',
    weight_name='Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1/high_noise_model.safetensors',
    adapter_name='lightning',
)
wan.load_lora_weights(
    'lightx2v/Wan2.2-Lightning',
    weight_name='Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1/low_noise_model.safetensors',
    adapter_name='lightning_2',
    load_into_transformer_2=True,
)
wan.set_adapters(['lightning', 'lightning_2'], adapter_weights=[1.0, 1.0])
```
How do I quantize this and then save it with save_pretrained?
@yiyixuxu @DN6
I think you need to fuse the LoRAs and then save. Possibly you only need to save the two transformers as well.
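Something like the sketch below should work, continuing from the `wan` pipeline above. This is a minimal, untested sketch: `fuse_lora` and `unload_lora_weights` are real diffusers APIs, but the intermediate and output directory names (`wan-fused`, `wan-fused-8bit`) are just illustrative.

```python
import torch
import diffusers

# Merge the active adapters into the base weights of both transformers,
# then drop the now-redundant LoRA layers.
wan.fuse_lora(components=['transformer', 'transformer_2'], lora_scale=1.0)
wan.unload_lora_weights()

# Save the fused (bf16, still unquantized) pipeline to disk.
wan.save_pretrained('wan-fused')

# Reload the fused checkpoint with 8-bit quantization applied, then save again.
wan_q = diffusers.WanVACEPipeline.from_pretrained(
    'wan-fused',
    # keep the VAE in fp32, as in the original snippet
    vae=diffusers.AutoencoderKLWan.from_pretrained(
        'wan-fused', subfolder='vae', torch_dtype=torch.float32
    ),
    torch_dtype=torch.bfloat16,
    quantization_config=diffusers.PipelineQuantizationConfig(
        quant_backend='bitsandbytes_8bit',
        quant_kwargs={'load_in_8bit': True},
        components_to_quantize=['transformer', 'transformer_2'],
    ),
)
wan_q.save_pretrained('wan-fused-8bit')
```

If you only need the two quantized transformers rather than the whole pipeline, you could save those components individually instead, e.g. `wan_q.transformer.save_pretrained(...)` and the same for `wan_q.transformer_2`.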