Johnny He
Following readme.md, when I run

```
python -m qmap.train_mario --render
```

the result is:

```
Traceback (most recent call last):
  File "/home/mirror/anaconda3/envs/qmap/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/mirror/anaconda3/envs/qmap/lib/python3.6/runpy.py",...
```
In the original SAC paper, the policy objective J_pi is given as: [image] However, in your implementation, the code is:

```
policy_loss = (log_prob * (log_prob - log_prob_target).detach()).mean()
mean_loss = mean_lambda * mean.pow(2).mean()
std_loss...
```
I notice that some environments in the DMControl suite may use more than one CPU thread, for example when we choose Domain_name=finger and Task_name=spin. I run this task on a server...
Hi, Kumar! In the last [issue](https://github.com/aviralkumar2907/BEAR/issues/6), you mentioned that you had not tested BEAR on the final buffer setting and recommended that I use the d4rl datasets. Following your comments, I use d4rl...