RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!

Open nuaabuaa07 opened this issue 1 year ago • 6 comments

Step 3 (merging the ChatLaw weights and running inference) fails with: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select). Is running inference on a multi-GPU machine not supported?

nuaabuaa07, Sep 04 '23 12:09

Does this mean the inference service can only be deployed on a single-GPU machine?

nuaabuaa07, Sep 04 '23 13:09

On a single GPU it runs out of memory instead: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 22.20 GiB total capacity; 21.53 GiB already allocated; 48.12 MiB free; 21.55 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
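The numbers in the out-of-memory message above can be checked directly. The following is only a minimal sketch: `parse_oom_message` is a hypothetical helper written for this illustration, not part of PyTorch, and it assumes the message keeps the exact wording quoted above.

```python
import re

# Error text copied from the message quoted above.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 50.00 MiB "
    "(GPU 0; 22.20 GiB total capacity; 21.53 GiB already allocated; "
    "48.12 MiB free; 21.55 GiB reserved in total by PyTorch)"
)

_UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom_message(message):
    """Hypothetical helper: return the sizes named in the message, in MiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * _UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom_message(OOM_MESSAGE)
# Memory PyTorch has reserved but is not currently using for tensors.
slack_mib = sizes["reserved"] - sizes["allocated"]
print(f"reserved - allocated = {slack_mib:.1f} MiB")
print(f"allocated / total    = {sizes['allocated'] / sizes['total']:.1%}")
```

Reserved and allocated memory differ by only about 20 MiB here, and roughly 97% of the card is already allocated. So the "reserved >> allocated" condition in the hint does not hold, and tuning max_split_size_mb is unlikely to help. The model most likely just does not fit on this one GPU.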

nuaabuaa07, Sep 04 '23 13:09

I ran into the same problem, but on a single-GPU machine, which also fails: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! Is there a fix, or any suggestions for troubleshooting?

niceyida, Dec 05 '23 02:12

Found a fix: the error was caused by the transforms version being too new. After rolling back to 4.29.0, the problem went away.

niceyida, Dec 05 '23 02:12

I can't find that version. How did you install it?

`(base) ➜ pip install transforms==4.29.0`
`ERROR: Could not find a version that satisfies the requirement transforms==4.29.0 (from versions: 0.1, 0.2.0, 0.2.1)`
`ERROR: No matching distribution found for transforms==4.29.0`

lichenyigit, Dec 17 '23 08:12

Sorry, the package name above was misspelled. It should be transformers. See https://pypi.org/project/transformers/#history
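As a rough sketch of how one might check whether the installed transformers is newer than the 4.29.0 reported to work in this thread: `version_tuple` and `check_pin` are hypothetical helpers written for this illustration, and the comparison only looks at the leading numeric major.minor.patch parts.

```python
import re
from importlib import metadata

PINNED_VERSION = "4.29.0"  # version reported to work in this thread


def version_tuple(version):
    """Hypothetical helper: turn 'X.Y.Z...' into a tuple of three ints."""
    parts = []
    for part in version.split(".")[:3]:
        match = re.match(r"\d+", part)  # keep leading digits, drop 'rc1' etc.
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def check_pin(package="transformers", pinned=PINNED_VERSION):
    """Hypothetical helper: describe the installed version relative to the pin."""
    install_hint = f"pip install {package}=={pinned}"
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed; try: {install_hint}"
    if version_tuple(installed) > version_tuple(pinned):
        return f"{package} {installed} is newer than {pinned}; try: {install_hint}"
    return f"{package} {installed} is not newer than {pinned}"


print(check_pin())
```

Whether a downgrade resolves the device-mismatch error on any particular setup is not guaranteed. This only automates the version comparison described above.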

niceyida, Dec 18 '23 02:12