[AAAI2024] A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
SO2: A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
SO2 offers a fresh perspective on offline-to-online reinforcement learning through Q-value estimation and presents a straightforward yet effective implementation.

Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang arXiv:2312.07685
Updates
- (12/2024) Code has been released!
Installation
All scripts
The complete sequence of commands, from installation to running an experiment, is listed here 👏.
# install MuJoCo for Linux
mkdir -p ~/.mujoco/mujoco210
wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz -O mujoco210-linux-x86_64.tar.gz
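# the archive unpacks into a top-level mujoco210/ directory, so strip it to avoid ~/.mujoco/mujoco210/mujoco210
tar -xf mujoco210-linux-x86_64.tar.gz -C ~/.mujoco/mujoco210 --strip-components=1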
pip install -U 'mujoco-py<2.2,>=2.1'
# install D4RL
pip install git+https://github.com/Farama-Foundation/d4rl@master#egg=d4rl
# install SO2
git clone https://github.com/opendilab/SO2
cd SO2
pip install -r requirements.txt
# run d4rl experiment
python3 -u so2/d4rl_main.py
Install environment
Install MuJoCo and mujoco-py
- Download the MuJoCo version 2.1 binaries for Linux or OSX.
- Extract the downloaded mujoco210 directory into ~/.mujoco/mujoco210.
Then install mujoco-py with pip:
pip install -U 'mujoco-py<2.2,>=2.1'
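Before moving on, it can help to confirm that the MuJoCo files ended up where mujoco-py looks for them. The following is a minimal sketch, not part of SO2: `check_mujoco_install` is a hypothetical helper, and it assumes the default location `~/.mujoco/mujoco210` with a `bin` subdirectory. On Linux, mujoco-py also expects that `bin` directory to be on `LD_LIBRARY_PATH`, so the sketch reports when it is not (this check is not relevant on macOS).

```python
import os
from pathlib import Path


def check_mujoco_install(root=None, ld_library_path=None):
    """Return a list of human-readable problems with a MuJoCo 2.1 install.

    Hypothetical helper for illustration; an empty list means no problem found.
    """
    # Assumption: the default install location used in the steps above.
    root = Path(root) if root is not None else Path.home() / ".mujoco" / "mujoco210"
    if not root.is_dir():
        return [f"missing directory: {root}"]

    bin_dir = root / "bin"
    if not bin_dir.is_dir():
        # A common mistake: extracting the archive into ~/.mujoco/mujoco210
        # instead of ~/.mujoco, which nests the files one level too deep.
        if (root / "mujoco210" / "bin").is_dir():
            return [f"archive was extracted one level too deep: {root / 'mujoco210'}"]
        return [f"missing directory: {bin_dir}"]

    # Linux only: the shared library in bin/ must be discoverable at run time.
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    entries = [Path(p).expanduser() for p in ld_library_path.split(os.pathsep) if p]
    if bin_dir not in entries:
        return [f"{bin_dir} is not on LD_LIBRARY_PATH"]
    return []


if __name__ == "__main__":
    for problem in check_mujoco_install():
        print("MuJoCo check:", problem)
```

If the last check fails on Linux, adding `~/.mujoco/mujoco210/bin` to `LD_LIBRARY_PATH` in your shell profile is the usual remedy.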
Install D4RL
D4RL can be installed by cloning the repository as follows:
git clone https://github.com/Farama-Foundation/d4rl.git
cd d4rl
pip install -e .
Or, alternatively:
pip install git+https://github.com/Farama-Foundation/d4rl@master#egg=d4rl
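To check that the installation worked, a short script like the one below can be used. It is a sketch under stated assumptions rather than part of SO2: `missing_packages` and `load_d4rl_dataset` are hypothetical helper names, and `halfcheetah-medium-v2` is only an example D4RL task (this README does not state which tasks `so2/d4rl_main.py` uses). The dataset loader relies on the standard D4RL pattern of `gym.make(...)` followed by `env.get_dataset()`, and downloads the data on first use.

```python
import importlib.util


def missing_packages(names=("mujoco_py", "d4rl")):
    """Return the entries of `names` that cannot be found on the import path.

    Only locates the packages; it does not import them, so it is cheap to run.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


def load_d4rl_dataset(env_name="halfcheetah-medium-v2"):
    """Load a D4RL dataset as a dict of NumPy arrays.

    Imports are deferred so this module can be imported without D4RL installed.
    `env_name` is an example task, not necessarily one used by SO2.
    """
    import gym
    import d4rl  # noqa: F401  (importing d4rl registers its tasks with gym)

    env = gym.make(env_name)
    return env.get_dataset()
```

A typical use is to call `missing_packages()` first and only call `load_d4rl_dataset()` when it returns an empty list; the returned dictionary contains keys such as `observations`, `actions`, `rewards`, and `terminals`.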
Install SO2
git clone https://github.com/opendilab/SO2
cd SO2
pip install -r requirements.txt
Getting Started
MuJoCo
python3 -u so2/d4rl_main.py
Results
MuJoCo

License
This project is released under the Apache 2.0 license. See LICENSE for details.
Citing SO2
If you use SO2 in your research or wish to refer to the baseline results published here, please use the following BibTeX entry.
@inproceedings{zhang2023perspective,
title={A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning},
author={Zhang, Yinmin and Liu, Jie and Li, Chuming and Niu, Yazhe and Yang, Yaodong and Liu, Yu and Ouyang, Wanli},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2024}
}