imitation Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Bug description

Traceback (most recent call last):
  File "/home/kavin/Documents/PycharmProjects/RL/Imitation/example.py", line 150, in <module>
    bc_trainer.train(n_epochs=1)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 495, in train
    training_metrics = self.loss_calculator(self.policy, obs_tensor, acts)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 130, in __call__
    (_, log_prob, entropy) = policy.evaluate_actions(
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/stable_baselines3/common/policies.py", line 736, in evaluate_actions
    log_prob = distribution.log_prob(actions)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/stable_baselines3/common/distributions.py", line 292, in log_prob
    return self.distribution.log_prob(actions)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/torch/distributions/categorical.py", line 127, in log_prob
    return log_pmf.gather(-1, value).squeeze(-1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA_gather)

Training seems to fail on a GPU: I get this error when I set the device to "cuda:0", but not when I set it to "cpu".
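For reference, the last frame of the traceback is a `gather` in `torch.distributions.Categorical.log_prob`, where the index argument (the actions) is on the CPU while the log-probabilities are on `cuda:0`. The following is a minimal sketch of that situation outside of imitation, not the library's code: `log_prob_on_same_device` is a hypothetical helper name, and the tensors are made up for illustration. It falls back to the CPU when no GPU is available, so it runs anywhere.

```python
import torch as th
from torch.distributions import Categorical


def log_prob_on_same_device(logits: th.Tensor, actions: th.Tensor) -> th.Tensor:
    """Hypothetical helper: move the actions to the logits' device before log_prob."""
    # Without this .to(...), a CPU `actions` tensor combined with CUDA logits
    # raises the "Expected all tensors to be on the same device" error in gather.
    actions = actions.to(logits.device)
    return Categorical(logits=logits).log_prob(actions)


# Use the GPU if one is present, otherwise stay on the CPU.
device = th.device("cuda:0" if th.cuda.is_available() else "cpu")

logits = th.zeros(4, 2, device=device)  # stand-in for the policy output
actions = th.tensor([0, 1, 1, 0])       # created on the CPU, like the batch actions

log_prob = log_prob_on_same_device(logits, actions)
```

With uniform logits over two actions, every entry of `log_prob` equals log(0.5), and the result lives on the same device as `logits`.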

Steps to reproduce

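# Imports inferred from the names used below (not shown in the original report).
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

from imitation.algorithms import bc
from imitation.data import rollout
from imitation.data.wrappers import RolloutInfoWrapper
from imitation.util.util import make_vec_env

print("Loading expert demonstrations...")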
rng = np.random.default_rng(0)

env = gym.make("CartPole-v1")
venv = make_vec_env("CartPole-v1", post_wrappers=[lambda env, _: RolloutInfoWrapper(env)], rng=rng)
expertAgent = PPO.load("ppo_cartpole.zip", print_system_info=False)

print("Rollouts...")
rollouts = rollout.rollout(
    expertAgent,
    venv,
    rollout.make_sample_until(min_timesteps=None, min_episodes=60),
    rng=rng,
)

bc_trainer = bc.BC(
    observation_space=venv.observation_space,
    action_space=venv.action_space,
    demonstrations=rollouts,
    rng=rng,
    device="cuda:0",
)

reward, _ = evaluate_policy(
    bc_trainer.policy,
    venv,
    n_eval_episodes=3,
    render=False,
)
print(f"Reward before training: {reward}")

print("Training a policy using Behavior Cloning")
bc_trainer.train(n_epochs=1)

reward, _ = evaluate_policy(
    bc_trainer.policy,
    env,
    n_eval_episodes=3,
    render=False,
)
print(f"Reward after training: {reward}")

Environment

Operating system and version: Ubuntu 20.04
Python version: 3.8
Output of pip freeze --all: stable-baselines3==2.2.0

Nov 18 '23 12:11 kavinwkp

I get this error too. RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA_gather)

It appears on the call to train() in at least two algorithms: dagger_trainer.train() and bc_trainer.train().

imitation version 1.0

Dec 16 '23 08:12 Rajesh-Siraskar

This error occurs because the function safe_to_tensor in imitation/src/imitation/util/util.py returns the tensor without transferring it to 'cuda'.

To fix the safe_to_tensor function, replace line 259, `return array`, with `return th.as_tensor(array, **kwargs)`.

Apr 04 '24 05:04 tvietphu
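The suggested fix relies on `th.as_tensor` honoring a `device` keyword even when its input is already a tensor: it returns the input unchanged if the dtype and device already match, and otherwise converts it as `.to()` would. Below is a minimal sketch of that behavior, not imitation's actual source: `to_tensor_on_device` is a hypothetical stand-in for the patched function, and the sample data is made up.

```python
import numpy as np
import torch as th


def to_tensor_on_device(array, **kwargs) -> th.Tensor:
    """Hypothetical stand-in for the patched safe_to_tensor."""
    # Always route through th.as_tensor, so tensors that are already
    # tensors still get moved to the requested device.
    return th.as_tensor(array, **kwargs)


# Use the GPU if one is present, otherwise stay on the CPU.
device = th.device("cuda:0" if th.cuda.is_available() else "cpu")

acts = th.tensor([0, 1, 1, 0])  # a tensor that starts out on the CPU
moved = to_tensor_on_device(acts, device=device)

obs = np.array([[0.1, 0.2], [0.3, 0.4]], dtype=np.float32)
obs_tensor = to_tensor_on_device(obs, device=device)
```

Both `moved` and `obs_tensor` end up on `device`, so a later `gather` or `log_prob` call sees its inputs on one device.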
