Joe Gaffney
It would be great to have more ways to reduce memory in video generation aside from the usual quantization and offloading (which do help a ton). WAN in particular uses a ton...
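For reference, the usual levers look roughly like this. A minimal sketch, assuming the Wan pipeline in diffusers and a VAE that supports tiling (the model id is just an example, and whether `enable_tiling` is available depends on your diffusers version):

```python
import torch
from diffusers import WanPipeline

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
)

pipe.enable_model_cpu_offload()  # keep idle components on the CPU between steps
pipe.vae.enable_tiling()         # decode latents in tiles, if the VAE supports it
```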
I think it used to work with
```python
from diffusers import export_to_video
```
But at some point my IDE linter started complaining, so I switched to importing from `diffusers.utils`.
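That is, the import that keeps the linter happy:

```python
from diffusers.utils import export_to_video
```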
I noticed this issue with the 16:9 ratio. Glad you found what it was.
Hey, I got some pretty good results doing:
```python
def image_edit_call(context: ImageContext):
    # see https://github.com/huggingface/diffusers/pull/12453/files
    import diffusers.pipelines.qwenimage.pipeline_qwenimage_edit_plus as qwen_edit_module
    qwen_edit_module.VAE_IMAGE_SIZE = context.width * context.height
    # gather all possible reference...
```
Hey, it would be interesting to know the rough before-and-after metrics. It would be great if this does reduce memory, as WAN's memory usage really shoots up as resolution and duration increase.
> The time it takes to load the model into RAM and VRAM varies, sometimes it is 12 seconds + 22 seconds, sometimes it is 21 seconds + 49 seconds....
I think you need to fuse the LoRAs, then save. Possibly you only need to save the two transformers as well.
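A minimal sketch of what I mean, assuming a Wan 2.2-style pipeline that exposes its two transformers as `transformer` and `transformer_2` (the model id, LoRA path, and output paths are placeholders, so verify the attribute names for your pipeline version):

```python
import torch
from diffusers import WanPipeline

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B-Diffusers", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("path/to/lora")  # hypothetical LoRA location
pipe.fuse_lora()            # bake the LoRA weights into the base weights
pipe.unload_lora_weights()  # drop the now-redundant adapter modules

# Possibly only the two transformers need saving:
pipe.transformer.save_pretrained("fused/transformer")
pipe.transformer_2.save_pretrained("fused/transformer_2")
```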
> When they release their Edit version, the internet will go down in flames. This model is seriously the definition of a game-changer.

Possibly the edit model would mostly negate...
Maybe the flow shift:
```python
from diffusers import UniPCMultistepScheduler

pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=3.0)
```
I know this fused Lightning model works when quantized and offloaded, but I only quantize the text encoder to...
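For the text-encoder-only quantization, a sketch along these lines, assuming diffusers' pipeline-level quantization config with the bitsandbytes backend (both need a recent diffusers and a bitsandbytes install; the model id and kwargs are assumptions, not a recipe):

```python
import torch
from diffusers import WanPipeline
from diffusers.quantizers import PipelineQuantizationConfig

# Quantize only the text encoder to 4-bit, leave the transformer in bf16.
quant_config = PipelineQuantizationConfig(
    quant_backend="bitsandbytes_4bit",
    quant_kwargs={"load_in_4bit": True, "bnb_4bit_compute_dtype": torch.bfloat16},
    components_to_quantize=["text_encoder"],
)
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B-Diffusers",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
```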
> This is also happening on image segmentation, extra VRAM (about 2 GB in my case) not released even after unloading the model and finishing processing.

Interesting, I didn't test the image one to...
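In case it helps, this is the generic best-effort cleanup I'd try in PyTorch, nothing specific to the segmentation model:

```python
import gc
import torch

def release_model(model):
    """Best-effort VRAM release once you're done with a model."""
    model.to("cpu")           # move the weights off the GPU first
    del model                 # drop the Python reference
    gc.collect()              # free the wrapper objects
    torch.cuda.empty_cache()  # return cached allocator blocks to the driver
    torch.cuda.ipc_collect()  # reclaim memory held by dead IPC handles
```

Note that the CUDA context itself keeps some memory resident for the life of the process, so a chunk of "unreleased" VRAM in nvidia-smi is expected even after this.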