
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)

Open · Ricardo-Charles opened this issue 4 years ago · 1 comment

Hello author, I downloaded your code and hit a RuntimeError when running it. I googled for a long time but couldn't find the exact cause. The full error message is as follows:

```
Traceback (most recent call last):
  File "/content/drive/MyDrive/QMIX/main.py", line 105, in <module>
    train()
  File "/content/drive/MyDrive/QMIX/main.py", line 37, in train
    episode, _, _ = rollout_worker.generate_episode(episode_idx)
  File "/content/drive/MyDrive/QMIX/utils.py", line 49, in generate_episode
    action = self.agents.choose_action(obs[agent_id], last_action[agent_id], agent_id, avail_action, epsilon, evaluate)
  File "/content/drive/MyDrive/QMIX/agent.py", line 37, in choose_action
    q_value, self.policy.eval_hidden[:, agent_num, :] = self.policy.eval_drqn_net(inputs, hidden_state)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/QMIX/NN.py", line 16, in forward
    h = self.rnn(x, h_in)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/rnn.py", line 1133, in forward
    self.bias_ih, self.bias_hh,
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)
```

Ricardo-Charles · Mar 22 '21

Sorry, I've been buried in a paper recently and didn't reply in time. This kind of problem usually happens because the data being loaded is on the CPU while the model has already been moved to the GPU.

You can locate the code that raises the error; there are two ways to fix it.

1. If you want to run on the CPU, pull the tensors you need back with .cpu() and then do the subsequent computation on the CPU.
2. If you want to run on the GPU, use .cuda() to move the data onto the GPU (a sketch of both options follows below).
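
For illustration, a minimal sketch of the two options in generic PyTorch; the model and tensor names here are made up for the example, not the variables in this repository:

```python
import torch

# Toy model and input; the names are illustrative only, not this repo's variables.
model = torch.nn.Linear(8, 4)
inputs = torch.randn(2, 8)

# Option 1: stay on the CPU -- keep both the model and the data on the CPU.
out_cpu = model.cpu()(inputs.cpu())

# Option 2: run on the GPU -- move both the model and the data to the GPU first.
if torch.cuda.is_available():
    model = model.cuda()
    out_gpu = model(inputs.cuda())
```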

My code's device handling may not be adapted very well; I'll improve it once the paper is done. For this problem, you can first try changing the self.cuda = False field in config.py to True.

Then add .to(self.device) to the relevant variables in the generate_episode() function in utils.py; a sketch of the pattern is given below. If you run into any other problems later, I'll answer them.
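
A rough sketch of that pattern (the network and tensor names below are stand-ins, not the actual variables in utils.py / agent.py): every tensor fed into the network has to be moved to the same device as the network.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-ins for eval_drqn_net and its inputs; shapes and names are illustrative only.
rnn_net = torch.nn.GRUCell(42, 64).to(device)   # the DRQN cell lives on `device`
inputs = torch.randn(1, 42).to(device)          # e.g. obs + last_action, moved to `device`
hidden_state = torch.zeros(1, 64).to(device)    # hidden-state slice, moved to `device`

# With everything on the same device, the forward pass no longer raises the addmm error.
h = rnn_net(inputs, hidden_state)
```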

Attached is a reference link you can follow for the changes: RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm) - PyTorch Forums

Best wishes!


thesouther · Mar 27 '21