InvokeAI
[bug]: RuntimeError: All input tensors need to be on the same GPU, but found some tensors to not be on a GPU:
Is there an existing issue for this problem?
- [x] I have searched the existing issues
Operating system
Windows
GPU vendor
None (CPU)
GPU model
AMD Radeon
GPU VRAM
No response
Version number
5.6.2
Browser
Edge
Python dependencies
{ "accelerate": "1.0.1", "compel": "2.0.2", "cuda": null, "diffusers": "0.31.0", "numpy": "1.26.4", "opencv": "4.9.0.80", "onnx": "1.16.1", "pillow": "11.1.0", "python": "3.11.11", "torch": "2.4.1+cpu", "torchvision": "0.19.1", "transformers": "4.46.3", "xformers": null }
What happened
Just installed the Flux starter package. Trying to create any image fails with RuntimeError: All input tensors need to be on the same GPU, but found some tensors to not be on a GPU: [(torch.Size([1, 98304]), device(type='cpu')), (torch.Size([3072]), device(type='cpu')), (torch.Size([3072, 64]), device(type='cpu'))] and generation exits.
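For context, the error indicates a device mismatch: the operation expects every input tensor to be on the same GPU, but some tensors are still on the CPU (unsurprising given the CPU-only torch 2.4.1+cpu build reported above). A minimal sketch of the failure mode and the usual fix (shapes and names are illustrative, not taken from InvokeAI):

    import torch

    # Tensors are created on the CPU unless a device is given explicitly.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    x = torch.randn(1, 64)                    # stays on the CPU by default
    w = torch.randn(64, 3072, device=device)  # on the GPU when one is available
    # Mixing x and w directly raises a same-device error on a GPU build;
    # moving every tensor to one device first avoids it.
    y = x.to(device) @ w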
What you expected to happen
An image to be generated
How to reproduce the problem
Install on Windows, download the Flux model starter pack, create an image
Additional context
Processor: AMD Ryzen 9 8945HS w/ Radeon 780M Graphics, 4.00 GHz
Installed RAM: 64.0 GB (61.8 GB usable)
Edition: Windows 11 Pro
Version: 24H2
Installed on: 02/02/2025
OS build: 26100.3194
Experience: Windows Feature Experience Pack 1000.26100.48.0
Discord username
No response
Same problem for me.
It is a fresh installation of Ubuntu set up specifically for InvokeAI (it doesn't work on Manjaro either).
CPU: Ryzen 2600
GPU: RX 5500 XT (8 GB)
RAM: 48 GB
OS: Ubuntu
Invoke CE (AppImage): v5.7.1
Both the low-VRAM mode and the keep-a-copy-in-RAM option are enabled.
On Linux, at least, placing the following line in your .bashrc (or .zshrc, if you use zsh) worked for me, and has solved previous issues with other CUDA/PyTorch setups:
export 'PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:256'
You may have to tinker with the split size until it works for you. This immediately resolved the error message mentioned here for me.
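A quick way to confirm the setting actually reaches the process (a minimal check, not InvokeAI-specific; the variable must be exported before the process that imports torch is launched):

    import os
    import torch  # torch reads PYTORCH_CUDA_ALLOC_CONF when its CUDA allocator initializes

    # Expect "max_split_size_mb:256" if the export above took effect.
    print(os.environ.get("PYTORCH_CUDA_ALLOC_CONF"))
    # False on a CPU-only build (e.g. torch 2.4.1+cpu), in which case the
    # allocator setting is moot and tensors stay on the CPU regardless.
    print(torch.cuda.is_available())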
This is for Nvidia; is there an equivalent for AMD?
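For what it's worth, ROCm builds of PyTorch expose an analogous allocator variable, PYTORCH_HIP_ALLOC_CONF (this assumes a ROCm build of torch; the torch 2.4.1+cpu version reported above is a CPU-only build, where neither variable applies). A minimal sketch of setting it before torch starts:

    import os

    # Assumption: a ROCm build of torch. PYTORCH_HIP_ALLOC_CONF is the ROCm
    # counterpart of PYTORCH_CUDA_ALLOC_CONF and must be set before the
    # allocator is first used, so set it before importing torch.
    os.environ["PYTORCH_HIP_ALLOC_CONF"] = "max_split_size_mb:256"

    import torch
    print(torch.cuda.is_available())  # ROCm devices also report through the cuda API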
Same here, but running Linux on AMD.
Server Error RuntimeError: All input tensors need to be on the same GPU, but found some tensors to not be on a GPU: [(torch.Size([256, 4096]), device(type='cpu'))]
Is there a solution to this?