ManiSkill2-Learn
Reproducing BC baseline results on soft body envs
I'm having trouble reproducing results on Pinch-v0. I was able to get Write-v0 and Hang-v0 working though.
Here are the commands I'm running, using demo conversion with general_soft_body_envs.txt and scripts/example_training/bc_soft_body_pointcloud.sh:
python maniskill2_learn/apis/run_rl.py configs/brl/bc/rgbd_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=rgbd" "env_cfg.n_points=1200" \
"env_cfg.reward_mode=dense" \
"env_cfg.control_mode=pd_ee_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_delta_pose_rgbd.h5" \
"replay_cfg.num_samples=50" "replay_cfg.cache_size=1024" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" \
"train_cfg.n_eval=50000" "train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=500"
I've also tried with env_cfg.control_mode=pd_ee_target_delta_pose.
How much memory would be needed to run with replay_cfg.num_samples=-1? Or is there a better way of training with all 1500+ demos using replay_cfg.dynamic_loading=True?
Pinch-v0 BC demos contain target images, so they consume a lot of memory.
I recommend modifying the demo replay buffer config in this case:
demo_replay_cfg=dict(
type="ReplayMemory",
capacity=int(2e4),
num_samples=-1,
cache_size=int(2e4),
dynamic_loading=True,
synchronized=False,
keys=["obs", "actions", "dones", "episode_dones"],
buffer_filenames=[
"PATH_TO_DEMO.h5",
],
),
That is, set demo_replay_cfg.dynamic_loading=True, demo_replay_cfg.capacity=20000, demo_replay_cfg.cache_size=20000, and demo_replay_cfg.num_samples=-1; this will load all demo data dynamically.
For BC, there is only one replay buffer, so replace demo_replay_cfg above with replay_cfg.
Note that for non-BC algorithms, demo_replay_cfg is not the same as replay_cfg: the demo replay buffer is separate from the (online) replay buffer that collects online environment trajectories.
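For the BC case, the same settings can be passed on the command line as replay_cfg.* overrides. Below is a minimal sketch that turns the dict above into --cfg-options strings. The helper to_cfg_options is hypothetical (it is not part of ManiSkill2-Learn); only the key names and values come from the config above.

```python
def to_cfg_options(prefix, cfg):
    """Flatten a config dict into 'prefix.key=value' strings for --cfg-options.

    Hypothetical helper for illustration; not part of ManiSkill2-Learn.
    """
    options = []
    for key, value in cfg.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            # Recurse into nested dicts so every leaf gets a dotted name.
            options.extend(to_cfg_options(name, value))
        else:
            options.append(f"{name}={value}")
    return options


# Same values as the demo_replay_cfg above, but under replay_cfg for BC.
replay_cfg = dict(
    capacity=20000,
    num_samples=-1,
    cache_size=20000,
    dynamic_loading=True,
    synchronized=False,
)

for option in to_cfg_options("replay_cfg", replay_cfg):
    # Quote each override the way the commands in this thread do.
    print(f'"{option}"')
```

The printed strings can be appended after --cfg-options, next to replay_cfg.buffer_filenames.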
I'm running on a machine with a 3090 and 64GB RAM, so I lowered the settings to replay_cfg.capacity=5000 and replay_cfg.cache_size=5000. Pinch-v0/trajectory.none.pd_ee_delta_pose_pointcloud.h5 is 40GB.
python maniskill2_learn/apis/run_rl.py configs/brl/bc/pointnet_soft_body.py \
--work-dir workdir/ --gpu-ids 0 \
--cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.obs_frame=ee" \
"env_cfg.reward_mode=dense" \
"env_cfg.control_mode=pd_ee_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_delta_pose_pointcloud.h5" \
"replay_cfg.capacity=5000" "replay_cfg.num_samples=-1" "replay_cfg.cache_size=5000" \
"replay_cfg.dynamic_loading=True" "replay_cfg.synchronized=False" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" \
"train_cfg.n_eval=50000" "train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=500"
Still unable to train pointcloud BC baseline. GPU utilization shows 0%, occasionally increasing to 3-8%. Attached the log.
Does it report anything if you set train_cfg.n_updates=5?
If it does, then it is training; it is just really slow due to file I/O.
BTW, is the demo stored on an SSD?
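One rough way to check whether the disk is the bottleneck is to time a sequential read of the demo file. The sketch below uses only the Python standard library; the function name and default sizes are illustrative assumptions. Dynamic loading may access the file in a different pattern than a plain sequential read, and a second run can be served from the OS page cache, so treat the number as an upper bound.

```python
import time


def measure_read_throughput(path, chunk_bytes=64 * 1024 * 1024, max_bytes=2 * 1024**3):
    """Read up to max_bytes from path sequentially and return throughput in MB/s.

    Illustrative helper; the defaults (64 MB chunks, 2 GB total) are arbitrary.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 opens the raw file so Python adds no extra buffer layer.
    with open(path, "rb", buffering=0) as f:
        while total < max_bytes:
            chunk = f.read(min(chunk_bytes, max_bytes - total))
            if not chunk:
                break  # reached end of file
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total / (1024 ** 2) / max(elapsed, 1e-9)
```

Calling measure_read_throughput on the .h5 demo file and comparing the result against the drive's rated speed gives a first hint of whether file I/O explains the low GPU utilization.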
Also, you can do custom processing in env wrappers and implement new architectures if you implement your own approach, since Pinch-v0 indeed has the largest observation space among all envs (the default wrapper only downsamples the observation point cloud, not target_rgb, target_points, or target_depth).
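As a rough illustration of that kind of custom processing, the sketch below randomly downsamples target_points in an observation dict with NumPy. It assumes a flat dict whose target_points entry is an array of shape (N, C); the actual observation layout in ManiSkill2 may be nested or shaped differently, and both function names are hypothetical. In practice this logic would live inside your own env wrapper.

```python
import numpy as np


def downsample_points(points, n_points, rng):
    """Return n_points rows of points, sampled uniformly at random."""
    num_available = points.shape[0]
    # Sample without replacement when possible; fall back to sampling
    # with replacement if the cloud has fewer than n_points rows.
    replace = num_available < n_points
    idx = rng.choice(num_available, size=n_points, replace=replace)
    return points[idx]


def shrink_observation(obs, n_points=1200, keys=("target_points",), seed=0):
    """Copy obs and downsample the listed point-cloud entries.

    Assumes obs is a flat dict and each listed key holds an (N, C) array.
    """
    rng = np.random.default_rng(seed)
    out = dict(obs)  # shallow copy so the input dict is left untouched
    for key in keys:
        if key in out:
            out[key] = downsample_points(np.asarray(out[key]), n_points, rng)
    return out
```

Image-like entries such as target_rgb or target_depth would need a different reduction (for example resizing), which this sketch does not cover.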
Yes, the demos are on root ssd.
It seems to start training, but grad_norm becomes 0 pretty quickly. This is true for both env_cfg.control_mode=pd_ee_target_delta_pose and env_cfg.control_mode=pd_ee_delta_pose.
python maniskill2_learn/apis/run_rl.py configs/brl/bc/pointnet_soft_body.py --work-dir workdir/ \
--gpu-ids 0 --cfg-options "env_cfg.env_name=Pinch-v0" "env_cfg.obs_mode=pointcloud" \
"env_cfg.n_points=1200" "env_cfg.obs_frame=ee" \
"env_cfg.reward_mode=dense" "env_cfg.control_mode=pd_ee_target_delta_pose" \
"replay_cfg.buffer_filenames=../ManiSkill2/demos/soft_body_envs/Pinch-v0/trajectory.none.pd_ee_target_delta_pose_pointcloud.h5" \
"replay_cfg.capacity=2000" "replay_cfg.num_samples=-1" "replay_cfg.cache_size=2000" \
"replay_cfg.dynamic_loading=True" "replay_cfg.synchronized=False" \
"eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" "train_cfg.n_eval=50000" \
"train_cfg.total_steps=50000" "train_cfg.n_checkpoint=50000" "train_cfg.n_updates=10"
I was able to reproduce it for point cloud BC. For RGB-D BC, however, the gradient does not fall to zero (RGB-D BC also requires more memory).