Aviral Kumar comments

Results: 10 comments by Aviral Kumar

About the readability

Hello @familyld, yes, I hope to add some more comments within about a month. Sorry for the delay in doing that.

Couldn't reproduce the result on MuJoCo Suite (d4rl datasets).

Hi, that's unfortunate, but can you try with these hyperparameters (I think the default hyperparameters in bear.py are not ideal): - Hopper: `kernel_type=laplacian`, `mmd_sigma=20`, `num_samples=100` -...
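To make the three Hopper settings above concrete, here is a minimal sketch of a squared MMD estimate with a Laplacian kernel, written in plain Python. It is the textbook estimator, not the code from bear.py: the function names, the exact kernel scaling by `sigma`, the 3-dimensional toy actions, and the uniform sampling are all assumptions made for illustration. Only the three values in `HOPPER_OVERRIDES` come from the comment.

```python
import math
import random


def laplacian_kernel(x, y, sigma):
    """Laplacian kernel k(x, y) = exp(-||x - y||_1 / sigma) (assumed scaling)."""
    l1_distance = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-l1_distance / sigma)


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(laplacian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, sigma):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    return (
        mean_kernel(xs, xs, sigma)
        + mean_kernel(ys, ys, sigma)
        - 2.0 * mean_kernel(xs, ys, sigma)
    )


# The Hopper values quoted in the comment; the dict itself is hypothetical.
HOPPER_OVERRIDES = {"kernel_type": "laplacian", "mmd_sigma": 20, "num_samples": 100}

# Toy stand-ins for dataset actions and policy actions (assumed 3-dimensional).
rng = random.Random(0)
n = HOPPER_OVERRIDES["num_samples"]
data_actions = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
policy_actions = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(n)]

print(mmd_squared(data_actions, policy_actions, HOPPER_OVERRIDES["mmd_sigma"]))
```

In this sketch `num_samples` sets how many actions are compared on each side, and `mmd_sigma` sets the kernel bandwidth; a larger bandwidth makes the kernel less sensitive to small differences between actions.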

Couldn't reproduce the result on MuJoCo Suite (d4rl datasets).

I have created a pull request in the d4rl_evaluations repo as well, listing these hyperparameters in the README. Also, which version of the D4RL datasets is this? We have changed/reorganized...

In place operations in algos.py

I am not sure if this is specific to this code or is coming from the optimizer and PyTorch. I do not know how to fix this, but very likely there...

QF_Loss backprops policy network

@olliejday I think the error is caused by the PyTorch version. Trying something like torch 1.4 could fix it, though something else might break. Could you please confirm...

QF_Loss backprops policy network

@olliejday @dosssman Detaching the Q-function will not work, since the policy would then not be trained using the Q-function, which is incorrect.
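The reasoning behind this can be sketched with the chain rule. The notation below is an assumption for illustration (a deterministic policy $\pi_\phi$, a critic $Q_\theta$, and a dataset $\mathcal{D}$), and it leaves out any other terms in the actual policy loss:

```latex
J(\phi) = \mathbb{E}_{s \sim \mathcal{D}}\left[ Q_\theta\big(s, \pi_\phi(s)\big) \right],
\qquad
\nabla_\phi J(\phi)
  = \mathbb{E}_{s \sim \mathcal{D}}\left[
      \nabla_a Q_\theta(s, a)\big|_{a = \pi_\phi(s)} \, \nabla_\phi \pi_\phi(s)
    \right].
```

If the Q-function output is detached, it is treated as a constant with respect to $\phi$, so $\nabla_\phi J(\phi) = 0$ and this term no longer updates the policy. The gradient with respect to $\phi$ has to pass through $Q_\theta$ for the policy to be trained on it.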

Discrepancy between results reported in CQL and D4rl papers

Hi, CQL reported numbers from the first arXiv version of the D4RL paper, which (for BEAR) have since improved in the newer version of D4RL. We will update the numbers...

Discrepancy between results reported in CQL and D4rl papers

The numbers in the NeurIPS version of the CQL paper (https://proceedings.neurips.cc/paper/2020/file/0d2b2061826a5df3221116a5085a6052-Paper.pdf) are supposed to be used as the reference, which refers to the table you mentioned. The original CQL paper (old...

RRT integration issues

Can you tell me the path to the nav_*.py file describing the hybrid automaton for the case where you ran the RRT code? The first error seems to be an error, where the...

RRT integration issues

I think for the true invariant, we can simply write it as "True" (boolean), and I will incorporate this in the RRT code (utils.py).