`from_pipe` converts pipelines to float32 by default
Describe the bug
Pipelines passed to `from_pipe()` are converted to float32 unless `torch_dtype` is specified, leading to higher memory usage and slower inference.
Reproduction
```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
print(f"Before: {pipe.dtype} - {torch.cuda.memory_allocated() // 1048576} MB")

# from_pipe() without torch_dtype: the shared components are converted,
# so the original pipeline's dtype changes too
i2i = StableDiffusionImg2ImgPipeline.from_pipe(pipe)
print(f"After: {pipe.dtype} - {torch.cuda.memory_allocated() // 1048576} MB")
```
Logs
```
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]`torch_dtype` is deprecated! Use `dtype` instead!
Loading pipeline components...: 100%|████████████████████████████████████████████████████| 7/7 [00:08<00:00,  1.17s/it]
Before: torch.float16 - 2637 MB
After: torch.float32 - 5258 MB
```
System Info
- 🤗 Diffusers version: 0.35.2
- Platform: Windows-10-10.0.19045-SP0
- Running on Google Colab?: No
- Python version: 3.10.11
- PyTorch version (GPU?): 2.9.1+cu126 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.36.0
- Transformers version: 4.57.3
- Accelerate version: 1.12.0
- PEFT version: not installed
- Bitsandbytes version: not installed
- Safetensors version: 0.7.0
- xFormers version: not installed
- Accelerator: NVIDIA GeForce GTX 1080, 8192 MiB
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
Who can help?
No response
hey! you need to pass `torch_dtype` in `from_pipe`

```python
i2i = StableDiffusionImg2ImgPipeline.from_pipe(pipe, torch_dtype=torch.float16)
```
Yes, but that shouldn't be necessary, nor do the docs mention it. It's meant to avoid allocating more memory, so shouldn't it keep the existing dtype?
Agreed that this is a regression in terms of DX.
Had some code that broke because of this change back when it was pushed out. Originally, the same dtype would be maintained.
we try to keep the API more or less consistent with `from_pretrained()`: instead of loading from a repo, you create a pipeline from an existing one, but it accepts the other arguments `from_pretrained()` would accept, including dtype
we can update the docs to include that info
if you look at from_pretrained, we don't maintain the original dtype of the checkpoint either
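For example, a minimal sketch of that behavior (same repo as the reproduction, with `torch_dtype` omitted entirely):

```python
from diffusers import StableDiffusionPipeline

# No torch_dtype: weights load as float32 regardless of how the checkpoint stores them
pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")
print(pipe.dtype)  # torch.float32
```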
@yiyixuxu That does make sense. In practice it's a little cumbersome, but I understand wanting to unify the pattern with from_pretrained.
No problems on my end now, we've updated all our code to reflect the current state.
The original pipeline is affected as well, so it may be better to not change it by default.
so, to clarify our design decision here:
We don't at all guarantee that the original pipeline remains unchanged after `from_pipe()`.
It would be nice if we could preserve the original pipeline's state, but handling all the edge cases around shared components would get complicated. We decided to keep things simple and maintain API consistency: by default, all pipelines are created in float32, whether via from_pretrained() or from_pipe(). If you need a different dtype, you pass torch_dtype explicitly in both cases.
we can definitely improve the docs to make this clearer though
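Concretely, the symmetry being described would look something like this (a sketch reusing the repo from the reproduction above):

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

repo_id = "stable-diffusion-v1-5/stable-diffusion-v1-5"

# The same explicit opt-in for both entry points; omit torch_dtype and both give float32
pipe = StableDiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
i2i = StableDiffusionImg2ImgPipeline.from_pipe(pipe, torch_dtype=torch.float16)
```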
You can preserve the original pipeline's state here, simply by not changing it. Isn't that the point, to share components between pipelines?
But I think we're both making this a bigger deal than it needs to be.
Basically, the problem I have with it is this: In the case of from_pipe, we've already told it what dtype we want. We shouldn't have to tell it again. None of the other options we've set get undone, so why this one?
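For what it's worth, a workaround sketch that avoids repeating the literal: forward the source pipeline's own reported dtype (assuming `pipe.dtype` reflects what was requested at load time):

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Reuse the dtype we already chose instead of hard-coding it a second time
i2i = StableDiffusionImg2ImgPipeline.from_pipe(pipe, torch_dtype=pipe.dtype)
```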