mbrl-lib

[Bug] What's the real_data_ratio for Hopper and Walker2d?

Open chenxi-yang opened this issue 1 year ago • 3 comments

Hi, I cannot get Hopper and Walker2d to work with the default settings. May I ask whether you set real_data_ratio > 0 for these two experiments? Thanks!

chenxi-yang avatar Sep 29 '22 21:09 chenxi-yang

Hi @chenxi-yang, thanks for the report. Can you be more explicit about the errors you are getting? Is it not learning at all, or is the reward lower than you expect? I don't think I've ever used real_data > 0.

luisenp avatar Oct 01 '22 11:10 luisenp

Hi, I tried a few other settings. To elaborate on my question: I can get a good final reward for my policy, but the training curve is unstable, as shown below (it has peaks during training). The command I used is: CUDA_VISIBLE_DEVICES=6 python -m mbrl.examples.main algorithm=mbpo overrides=mbpo_hopper dynamics_model=gaussian_mlp

[image: training curve]

chenxi-yang avatar Oct 03 '22 19:10 chenxi-yang

Ah, got it. Unfortunately, the peaky behavior is definitely an issue for our MBPO implementation; certainly it is for Hopper, but I think also for other domains. Not sure if this is the cause, but when I swept for hyperparameters, I roughly optimized for area under curve on a single seed, rather than stable behavior. So, it's possible that this could be addressed by tweaking hyperparameters, but I haven't looked into this.

luisenp avatar Oct 03 '22 19:10 luisenp
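
For anyone re-running a sweep with stability in mind, here is a minimal sketch of the difference between ranking runs by area under the curve and ranking them with a penalty for drops. Nothing here comes from mbrl-lib: the function names, the drawdown penalty, the `drawdown_weight` parameter, and the toy return curves are all assumptions made for illustration, and the original sweep may have computed area under the curve differently.

```python
from statistics import mean


def area_under_curve(returns):
    """Trapezoidal area under an evaluation-return curve (unit spacing)."""
    return sum((a + b) / 2.0 for a, b in zip(returns, returns[1:]))


def max_drawdown(returns):
    """Largest drop of the return below any earlier best value."""
    best = returns[0]
    worst_drop = 0.0
    for r in returns:
        best = max(best, r)
        worst_drop = max(worst_drop, best - r)
    return worst_drop


def stability_aware_score(curves, drawdown_weight=1.0):
    """Mean AUC over seeds minus a weighted mean drawdown (hypothetical metric)."""
    mean_auc = mean(area_under_curve(c) for c in curves)
    mean_drop = mean(max_drawdown(c) for c in curves)
    return mean_auc - drawdown_weight * mean_drop


# Toy evaluation returns, made up for illustration (one curve per seed).
smooth = [0.0, 1.0, 2.0, 3.0, 4.0]
peaky = [0.0, 5.0, 0.0, 5.0, 4.0]

auc_smooth = area_under_curve(smooth)
auc_peaky = area_under_curve(peaky)
score_smooth = stability_aware_score([smooth])
score_peaky = stability_aware_score([peaky])
```

On these toy curves the peaky run has the larger area under the curve, while the penalized score prefers the smooth run. Averaging over several seeds (passing more than one curve) is what makes the score less sensitive to a single lucky run.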

Thanks for the update.

chenxi-yang avatar Oct 26 '22 03:10 chenxi-yang