[Exception] Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!
Describe the bug
Scenario: developing a face-editing server with FastAPI + Celery + diffusers; each request spawns a Celery task.
Concurrency: 2 FastAPI workers, 1 Celery worker.
Diffusers usage:
- 1 StableDiffusionControlNetPipeline + 3 StableDiffusionControlNetInpaintPipeline
- the pipelines share the components of two checkpoints (see the sketch after this list)
- enable_model_cpu_offload is enabled on every pipeline
Exception: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_CUDA___slow_conv2d_forward)
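For reference, the sharing + offload pattern reduced to a standalone sketch (the checkpoint id here is illustrative only; the actual single-file checkpoints are listed under Reproduction):

import torch
from diffusers import StableDiffusionPipeline

# Two pipelines share one set of components; each pipeline then installs
# CPU-offload hooks on the same underlying modules.
base = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
)
shared = StableDiffusionPipeline(**base.components)
base.enable_model_cpu_offload()
shared.enable_model_cpu_offload()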
Reproduction
Pseudocode (the real code runs as Celery async tasks):

import os

import torch
from diffusers import (
    StableDiffusionControlNetInpaintPipeline,
    StableDiffusionControlNetPipeline,
    StableDiffusionInpaintPipeline,
    StableDiffusionPipeline,
)

# pretrained_model_dir, sd_config_file and the ControlNetModel instances
# (controlnet_inpaint, controlnet_openpose, controlnet_lineart) are loaded
# elsewhere in the service.
inpainting_pipeline = StableDiffusionInpaintPipeline.from_single_file(
    os.path.join(pretrained_model_dir, "realisticVisionV60B1_v51VAE-inpainting.safetensors"),
    original_config_file=sd_config_file,
    local_files_only=True,
    use_safetensors=True,
    torch_dtype=torch.float16,
)
text2img_pipeline = StableDiffusionPipeline.from_single_file(
    os.path.join(pretrained_model_dir, "realisticVisionV60B1_v51VAE.safetensors"),
    original_config_file=sd_config_file,
    local_files_only=True,
    use_safetensors=True,
    torch_dtype=torch.float16,
)

# Four ControlNet pipelines built on the two checkpoints' shared components.
pipeline_1 = StableDiffusionControlNetInpaintPipeline(
    **inpainting_pipeline.components, controlnet=controlnet_inpaint
)
pipeline_2 = StableDiffusionControlNetPipeline(
    **text2img_pipeline.components, controlnet=controlnet_openpose
)
pipeline_3 = StableDiffusionControlNetInpaintPipeline(
    **inpainting_pipeline.components, controlnet=controlnet_lineart
)
pipeline_4 = StableDiffusionControlNetInpaintPipeline(
    **text2img_pipeline.components, controlnet=controlnet_lineart  # reuses the same controlnet_lineart
)
pipeline_1 through pipeline_4 all enable enable_model_cpu_offload; .to("cuda") is never called (see the sketch below).
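A minimal sketch of that last step, assuming the four pipeline variables from the pseudocode above:

for pipe in (pipeline_1, pipeline_2, pipeline_3, pipeline_4):
    # Moves each sub-model to cuda only for its forward pass and back to cpu
    # afterwards; with shared components, every pipeline attaches its own
    # offload hooks to the same modules.
    pipe.enable_model_cpu_offload()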
Logs
Traceback (most recent call last):
File "/root/projects/octaface/tasks.py", line 274, in change_hair_color_task
output_image = face_editing.change_hair_color(input_image_file=cached_photo_file_path,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/projects/octaface/edit/face.py", line 1493, in change_hair_color
output_image = self.color_inpaint_pipe(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint.py", line 1437, in __call__
latents_outputs = self.prepare_latents(
^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint.py", line 994, in prepare_latents
image_latents = self._encode_vae_image(image=image, generator=generator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint.py", line 1072, in _encode_vae_image
image_latents = retrieve_latents(self.vae.encode(image), generator=generator)
^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/diffusers/models/autoencoders/autoencoder_kl.py", line 260, in encode
h = self.encoder(x)
^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/diffusers/models/autoencoders/vae.py", line 143, in forward
sample = self.conv_in(sample)
^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 460, in forward
return self._conv_forward(input, self.weight, self.bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/octaface/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_CUDA___slow_conv2d_forward)
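If it helps triage, a hypothetical probe (not part of the service code) run right before the failing call shows which side stayed on cpu:

# Where the shared VAE's weights currently live vs. the device the
# pipeline moves its inputs to.
print(next(pipeline_1.vae.parameters()).device)  # e.g. cpu if the hooks offloaded it
print(pipeline_1._execution_device)              # internal attribute; e.g. cuda:0 with offload enabled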
System Info
- diffusers version: 0.27.2
- Platform: Linux-5.4.0-153-generic-x86_64-with-glibc2.35
- Python version: 3.11.9
- PyTorch version (GPU?): 2.2.1+cu121 (True)
- Huggingface_hub version: 0.22.1
- Transformers version: 4.39.1
- Accelerate version: 0.28.0
- xFormers version: 0.0.25
- Using GPU in script?: yes, from Python Celery tasks
- Using distributed or parallel set-up in script?: no, only 1 Celery worker
Who can help?
No response