# mineral

A minimal(ish) reinforcement learning library that aggregates reliable implementations for:
- PPO, Proximal Policy Optimization (minimal-stable-PPO, rl_games) #rl
- DDPG, Deep Deterministic Policy Gradient (pql, drqv2, TD3) #rl
- SAC, Soft Actor Critic (pql, pytorch-sac-ae) #rl
- APG / BPTT, Analytic Policy Gradient / Backpropagation Through Time (DiffRL) #rl, #diffsim
- SHAC, Short Horizon Actor Critic (DiffRL) #rl, #diffsim
- SAPO, Soft Analytic Policy Optimization (ours) #rl, #diffsim
- BC, Behavioral Cloning #il
- DAPG, Demo Augmented Policy Gradient (maniskill2-learn) #il, #rl, #off2on
## Tags
| tag | description |
|---|---|
| #rl | (online) reinforcement learning |
| #offrl | offline reinforcement learning |
| #il | (offline) imitation learning |
| #off2on | offline-to-online |
| #diffsim | differentiable simulation |
| #mpc | model predictive control |
## Setup

```bash
conda create -n mineral python=3.10
conda activate mineral

pip install "torch>=2" torchvision
pip install git+https://github.com/etaoxing/mineral
```
### Rewarped

See https://github.com/rewarped/rewarped.
### DFlex

```bash
pip install gym==0.23.1
pip install git+https://github.com/rewarped/DiffRL

# make sure to run this so DFlex kernels are built
python -m dflex.examples.test_env --env AntEnv --num-envs 4 --render
```
### IsaacGymEnvs

NOTE: requires Python 3.8.

```bash
tar -xvf IsaacGym_Preview_4_Package.tar.gz
ln -s <path/to/isaacgym> third_party/isaacgym
cd third_party/isaacgym
pip install . --no-dependencies
cd -

git clone https://github.com/isaac-sim/IsaacGymEnvs third_party/IsaacGymEnvs
cd third_party/IsaacGymEnvs
git checkout b6dd437e68f94255f5a6306da76f2f0f9a634d6e
# comment out rl_games in INSTALL_REQUIRES of setup.py
pip install .
cd -
```
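The "comment out rl_games" step above can also be scripted. The `sed` pattern below is an assumption (inspect `setup.py`'s `INSTALL_REQUIRES` formatting before relying on it); it is demonstrated here on a stand-in file, `setup_demo.py`, rather than the real `setup.py`:

```shell
# stand-in for the relevant fragment of IsaacGymEnvs' setup.py
cat > setup_demo.py <<'EOF'
INSTALL_REQUIRES = [
    "rl_games",
    "torch",
]
EOF

# prefix any line mentioning rl_games with "# ", keeping a .bak backup
sed -i.bak '/rl_games/s/^/# /' setup_demo.py
```

Run against the actual `setup.py` in `third_party/IsaacGymEnvs`, this leaves the other requirements untouched while disabling the `rl_games` pin.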
If you hit `ImportError: libpython3.8.so.1.0: cannot open shared object file: No such file or directory`, use `import_isaacgym()` from `mineral/envs/isaacgymenvs.py`.
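Alternatively, a common workaround for this error in conda environments (a general conda fix, not something documented by this repo) is to put the environment's `lib/` directory on the dynamic linker's search path before launching Python:

```shell
# libpython3.8.so.1.0 ships inside the conda env; prepend its lib/ dir
# so the dynamic linker can resolve it when isaacgym is imported
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```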
## Usage

See commands in examples/:

- Rewarped
- DFlex
- IsaacGymEnvs
Want to use your own configs, agents, or envs? Check out `examples/run.py`, and replace `python -m mineral.scripts.run ...` with `python -m examples.run 'hydra.searchpath=[pkg://examples/cfgs]' ...` to load configs from `examples/cfgs/`.

Use `CUDA_VISIBLE_DEVICES=1 python ...` to run on a specific GPU.

Use `python ... run=eval task.env.render=True ckpt="workdir/<exp>/<run>/ckpt/final.pth"` to load checkpoints and visualize agents (trajectories are saved as USDs).