fastsdcpu
Add experimental support for loading .safetensors models
As the title says, this PR adds experimental support for loading models in safetensors format. I call it experimental because, even though it technically works, I've noticed that rebuilding the pipeline a couple of times, for example by enabling/disabling ControlNet, causes RAM usage to increase considerably, making it impractical on low-end PCs.
Some other things to note in this PR:
- Loading safetensors models only works in LCM-LoRA mode; I think it's the only mode where it makes sense (see the loading sketch after this list).
- LCM-LoRA mode was always considerably slower than plain LCM mode on my machine, about 50% slower. I always assumed it was a speed hit from LCM-LoRA mode itself, but I noticed that by commenting out this line: `pipeline.unet.to(memory_format=torch.channels_last)` inference speed automatically became much better. Is there any particular reason why this line is used?
- By going further and always fusing the LCM-LoRA with the base model, inference speed in LCM-LoRA mode gets almost as fast as plain LCM mode (see the second sketch after this list).
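For reference, here's a rough sketch of the kind of loading path this PR enables, using diffusers' `from_single_file()` loader; the checkpoint path and prompt are just placeholders, not the exact code in the PR:

```python
import torch
from diffusers import LCMScheduler, StableDiffusionPipeline

# Load an SD 1.5 checkpoint directly from a single .safetensors file
# (the path is a placeholder).
pipeline = StableDiffusionPipeline.from_single_file(
    "models/some_sd15_model.safetensors",
    torch_dtype=torch.float32,  # CPU inference
)

# LCM-LoRA mode: switch to the LCM scheduler and attach the LCM-LoRA weights.
pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

image = pipeline(
    "a photo of a cat",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("out.png")
```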
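And a sketch of the two speed tweaks mentioned above, fusing the LCM-LoRA into the base UNet and leaving out the `channels_last` conversion; this is just an illustration, the actual effect probably depends on the CPU and the torch build:

```python
# Fuse the LCM-LoRA weights into the base UNet so the LoRA layers
# no longer add overhead at inference time.
pipeline.fuse_lora()

# The line in question; commenting it out improved speed on my machine.
# pipeline.unet.to(memory_format=torch.channels_last)
```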
`pipe = pipeline_class(`
`TypeError: StableDiffusionPipeline.__init__() got an unexpected keyword argument 'text_encoder_2'`
I tried to load sd_xl_base_1.0.safetensors.
The same model loads fine in diffusers format.
Am I missing something?
Sorry, I should have mentioned that this PR works only for SD 1.5 models. Similar code should work for SDXL (a rough sketch is below), but unfortunately I can't run SDXL on my machine.
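Untested on my side since I can't run SDXL, but something along these lines is probably the starting point; SDXL checkpoints bundle a second text encoder (hence the `text_encoder_2` error above), so they need the SDXL pipeline class and the SDXL LCM-LoRA. The checkpoint path is a placeholder:

```python
import torch
from diffusers import LCMScheduler, StableDiffusionXLPipeline

# SDXL checkpoints carry a second text encoder, so they need the
# SDXL pipeline class rather than StableDiffusionPipeline.
pipeline = StableDiffusionXLPipeline.from_single_file(
    "models/sd_xl_base_1.0.safetensors",
    torch_dtype=torch.float32,
)

pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
# SDXL has its own LCM-LoRA weights.
pipeline.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipeline.fuse_lora()
```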
@monstruosoft Going to merge this PR, thanks.