ASE
Minor bug: agent is using cuda:0 device no matter what rl_device arg is
Problem
- ase.learning.common_agent.CommonAgent inherits rl_games.common.a2c_common.A2CBase, which stores all tensors on self.ppo_device.
- self.ppo_device is set by reading the device key from config. If there is no device key, it defaults to cuda:0. (see here)
- Tracing back to the run.py file, config is supplied by cfg_train["params"]["config"]. If you print cfg_train["params"]["config"].keys(), you can see that there is no device key.
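The fallback described above can be illustrated with a plain dictionary lookup. This is a minimal sketch under assumptions, not the rl_games source: resolve_ppo_device is a hypothetical helper, and the config entries other than device are made up for illustration.

```python
def resolve_ppo_device(config):
    # Hypothetical helper: use the "device" key if present,
    # otherwise fall back to cuda:0, as described in the issue.
    return config.get("device", "cuda:0")


# Illustrative config without a "device" key (keys are assumptions).
config_without_device = {"name": "Humanoid", "learning_rate": 2e-5}

# Same config with the key set explicitly.
config_with_device = dict(config_without_device, device="cuda:1")

print(resolve_ppo_device(config_without_device))
print(resolve_ppo_device(config_with_device))
```

Since the --rl_device value never reaches this dictionary, the lookup falls through to the default every time.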
How to check
To check this issue, run the original pretraining command with the --rl_device argument set to another CUDA device, such as cuda:1. The agent still consumes cuda:0 memory:
python ase/run.py --task HumanoidAMPGetup --cfg_env ase/data/cfg/humanoid_ase_sword_shield_getup.yaml --cfg_train ase/data/cfg/train/rlg/ase_humanoid.yaml --motion_file ase/data/motions/reallusion_sword_shield/dataset_reallusion_sword_shield.yaml --headless --rl_device cuda:1
How to fix
To fix this, simply add cfg_train["params"]["config"]["device"] = args.rl_device in the function load_cfg().
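A minimal sketch of the fix, assuming cfg_train is the nested dictionary loaded from the training YAML and args carries the parsed --rl_device value. The helper name apply_rl_device and the stand-in objects are hypothetical; in the repository the assignment would sit inside load_cfg() itself.

```python
from types import SimpleNamespace


def apply_rl_device(cfg_train, args):
    # Hypothetical helper: copy the command-line device into the
    # agent config so the "device" lookup no longer falls back to cuda:0.
    cfg_train["params"]["config"]["device"] = args.rl_device
    return cfg_train


# Stand-ins for the parsed arguments and the loaded training config.
args = SimpleNamespace(rl_device="cuda:1")
cfg_train = {"params": {"config": {"name": "Humanoid"}}}

apply_rl_device(cfg_train, args)
print(cfg_train["params"]["config"]["device"])
```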