D4RL
A collection of reference environments for offline reinforcement learning
Hi, I notice there are differences between results reported in CQL paper and D4RL paper for this benchmark. Since some of the authors are common for both papers, can you...
Currently the [reproducibility guide](https://github.com/rail-berkeley/d4rl/wiki/Dataset-Reproducibility-Guide) in the MuJoCo section doesn't say which version of the environments was used (`Hopper-v2` vs `Hopper-v3`), nor is this present anywhere else in the...
What `mujoco` version was used for collecting the datasets? I observed that the latest gym version discontinued support for `mujoco>=2.0`, whereas the `mujoco_py` README shows instructions for installing `mujoco 2.0`....
Hi, as stated in the README, the "infos" contained in each dataset are task-specific debugging information, but what are they exactly for the kitchen environment? And where can I find descriptions...
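One way to see what is actually stored, regardless of documentation, is to list the `infos/*` keys once the dataset is loaded. A minimal sketch, assuming the dataset loads as a flat dict whose debugging fields live under keys prefixed with `infos/` (the keys in the toy dict below are illustrative, not the kitchen environment's actual fields):

```python
# Sketch: listing the task-specific "infos" fields in a D4RL-style dataset.
# Assumes the dataset is a flat dict; the toy keys below are hypothetical.

def info_keys(dataset):
    """Return the sorted names of all infos/* fields in a flat dataset dict."""
    return sorted(k for k in dataset if k.startswith("infos/"))

# Toy stand-in for env.get_dataset() output:
sample = {
    "observations": [[0.0]],
    "actions": [[0.0]],
    "infos/goal": [[1.0]],
    "infos/qpos": [[0.5]],
}
print(info_keys(sample))  # ['infos/goal', 'infos/qpos']
```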
Hello, Thank you for making this available. However, holding MuJoCo as a dependency restricts open and free research. MuJoCo has a heavy license fee and private student licenses can't be...
There seems to be a single terminal = true flag in each of the half cheetah datasets. Do you know why this is the case? The half cheetah gym environment...
First, thank you for sharing the repo! The dataset seems to consist of state-action pairs; is there a way to recover the entire rollout of a policy?
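If the flat transition arrays carry episode-end flags, rollouts can be reconstructed by splitting at those boundaries. A minimal sketch, assuming the dataset is a dict of aligned arrays with `terminals` and `timeouts` flags marking episode ends (these names follow the common D4RL layout; verify them against your dataset's actual keys):

```python
# Sketch: recovering per-episode rollouts from a flat D4RL-style dataset.
# Assumes aligned arrays and "terminals"/"timeouts" episode-end flags.

def split_into_episodes(dataset):
    """Split flat transition arrays into a list of per-episode dicts."""
    n = len(dataset["observations"])
    timeouts = dataset.get("timeouts", [False] * n)
    episodes, start = [], 0
    for i in range(n):
        if dataset["terminals"][i] or timeouts[i] or i == n - 1:
            episodes.append({k: v[start:i + 1] for k, v in dataset.items()})
            start = i + 1
    return episodes

# Toy example: three transitions forming two episodes.
toy = {
    "observations": [[0.0], [0.1], [0.2]],
    "actions": [[1.0], [1.0], [1.0]],
    "terminals": [False, True, False],
    "timeouts": [False, False, False],
}
print(len(split_into_episodes(toy)))  # 2
```

Note that this recovers trajectory segments as stored; if the data collection truncated episodes without setting a flag, the original rollouts cannot be recovered exactly.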
Hi, I could not find the propensities of the logging policy in the dataset. Can they be made available, since importance weighted methods would benefit from that knowledge? Thanks!
Hi, I would like to use d4rl dataset for DICE scenarios, where sampling from initial states is required. I thought the termination flag could be helpful at first glance, but...
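If the termination and timeout flags do mark episode ends, the initial-state distribution can be approximated by taking the first observation of each episode. A minimal sketch under that assumption (the flag names follow the common D4RL layout; truncated trajectories without flags would be missed):

```python
# Sketch: indices of observations that begin a new episode, for sampling
# initial states in DICE-style methods. Assumes "terminals"/"timeouts"
# flags mark episode ends.

def initial_state_indices(terminals, timeouts):
    """Index 0 plus every index immediately following an episode end."""
    starts = [0]
    for i in range(len(terminals) - 1):
        if terminals[i] or timeouts[i]:
            starts.append(i + 1)
    return starts

terminals = [False, False, True, False, False]
timeouts = [False, False, False, False, True]
print(initial_state_indices(terminals, timeouts))  # [0, 3]
```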
RLkit has changed a lot; for example, FlattenMlp no longer exists. Any recommendation on which RLkit version is compatible with your repository? Thank you in advance. :) Best