AWAC
Advantage weighted Actor Critic for Offline RL
pytorch-AWAC
Advantage Weighted Actor Critic (AWAC) for offline RL, implemented in PyTorch.
A cleaner implementation of AWAC, built on top of the Spinning Up SAC implementation and rlkit.
If you use this code in your research project, please cite us as:
@software{Sikchi_pytorch-AWAC,
author = {Sikchi, Harshit and Wilcox, Albert},
doi = {10.5281/zenodo.5121023},
title = {{pytorch-AWAC}},
url = {https://github.com/hari-sikchi/AWAC}
}
Running the code
python run_agent.py --env <env_name> --seed <seed_no> --exp_name <experiment name> --algorithm 'AWAC'
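To launch the same experiment over several seeds, the command above can be generated programmatically. The following is a minimal sketch that only builds the argument lists and prints them; it uses just the flags shown above. The helper name `build_commands`, the seed values, and the `<prefix>_s<seed>` experiment naming are illustrative assumptions, not part of the repository.

```python
# Sketch: build one run_agent.py command per seed.
# build_commands and the experiment naming scheme are hypothetical.
import shlex


def build_commands(env_name, seeds, exp_prefix, algorithm="AWAC"):
    """Return one argument list per seed, using only the flags shown above."""
    commands = []
    for seed in seeds:
        commands.append([
            "python", "run_agent.py",
            "--env", env_name,
            "--seed", str(seed),
            "--exp_name", f"{exp_prefix}_s{seed}",  # assumed naming scheme
            "--algorithm", algorithm,
        ])
    return commands


if __name__ == "__main__":
    # Ant-v2 is the environment used in the Results section.
    for cmd in build_commands("Ant-v2", seeds=[0, 1, 2], exp_prefix="awac_ant"):
        print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root to run the seeds one after another.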
Plotting
python plot.py <data_folder> --value <y_axis_coordinate>
The plotting script plots the runs in every subfolder of the given folder. The --value argument selects the quantity shown on the y-axis. It can be:
- AverageTestEpRet
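To inspect the logged values without plot.py, the logs can be read directly. The sketch below assumes each run subfolder contains a tab-separated `progress.txt` with a header row, which is the convention of the Spinning Up logger; this repository's actual log layout may differ. The function name `load_metric` and its parameters are hypothetical.

```python
# Sketch: collect one logged column from every run under a data folder.
# Assumes tab-separated progress.txt files (Spinning Up logger convention).
import csv
from pathlib import Path


def load_metric(data_folder, value="AverageTestEpRet", log_name="progress.txt"):
    """Map each run folder (relative to data_folder) to its values for `value`."""
    root = Path(data_folder)
    curves = {}
    # rglob also finds logs nested more than one level below data_folder.
    for log_path in sorted(root.rglob(log_name)):
        with log_path.open(newline="") as f:
            rows = list(csv.DictReader(f, delimiter="\t"))
        # Skip empty logs and logs that do not contain the requested column.
        if rows and value in rows[0]:
            run_name = str(log_path.parent.relative_to(root))
            curves[run_name] = [float(row[value]) for row in rows]
    return curves
```

The returned dictionary holds one list of values per run, in logging order, which can then be averaged across seeds or passed to any plotting library.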
Results
Environment: Ant-v2
Dataset: https://drive.google.com/file/d/1edcuicVv2d-PqH1aZUVbO5CeRq3lqK89/view
Plot [Compare to Figure 5 of the official paper]:
References
@article{SpinningUp2018,
author = {Achiam, Joshua},
title = {{Spinning Up in Deep Reinforcement Learning}},
year = {2018}
}
@article{nair2020accelerating,
author = {Nair, Ashvin and Dalal, Murtaza and Gupta, Abhishek and Levine, Sergey},
title = {Accelerating Online Reinforcement Learning with Offline Datasets},
journal = {arXiv preprint arXiv:2006.09359},
year = {2020}
}