# mineral

A minimal(ish) reinforcement learning library that aggregates reliable implementations for:
- PPO, Proximal Policy Optimization (minimal-stable-PPO, rl_games) #rl
- DDPG, Deep Deterministic Policy Gradient (pql, drqv2, TD3) #rl
- SAC, Soft Actor Critic (pql, pytorch-sac-ae) #rl
- APG / BPTT, Analytic Policy Gradient / Backpropagation Through Time (DiffRL) #rl, #diffsim
- SHAC, Short Horizon Actor Critic (DiffRL) #rl, #diffsim
- SAPO, Soft Analytic Policy Optimization (ours) #rl, #diffsim
- BC, Behavioral Cloning #il
- DAPG, Demo Augmented Policy Gradient (maniskill2-learn) #il, #rl, #off2on
## Tags
| tag | description |
|---|---|
| #rl | (online) reinforcement learning |
| #offrl | offline reinforcement learning |
| #il | (offline) imitation learning |
| #off2on | offline-to-online |
| #diffsim | differentiable simulation |
| #mpc | model predictive control |
## Setup

```bash
conda create -n mineral python=3.10
conda activate mineral

pip install "torch>=2" torchvision
pip install git+https://github.com/etaoxing/mineral
```
### Rewarped

See https://github.com/rewarped/rewarped.
### DFlex

```bash
pip install gym==0.23.1
pip install git+https://github.com/rewarped/DiffRL

# make sure to run this so DFlex kernels are built
python -m dflex.examples.test_env --env AntEnv --num-envs 4 --render
```
### IsaacGymEnvs

NOTE: requires Python 3.8.

```bash
tar -xvf IsaacGym_Preview_4_Package.tar.gz
ln -s <path/to/isaacgym> third_party/isaacgym
cd third_party/isaacgym
pip install . --no-dependencies
cd -

git clone https://github.com/isaac-sim/IsaacGymEnvs third_party/IsaacGymEnvs
cd third_party/IsaacGymEnvs
git checkout b6dd437e68f94255f5a6306da76f2f0f9a634d6e
# comment out rl_games in INSTALL_REQUIRES of setup.py
pip install .
cd -
```
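The "comment out rl_games" step above can also be scripted. The `sed` pattern below is an assumption (inspect `setup.py`'s `INSTALL_REQUIRES` formatting before relying on it); it is demonstrated here on a stand-in file, `setup_demo.py`, rather than the real `setup.py`:

```shell
# stand-in for the relevant fragment of IsaacGymEnvs' setup.py
cat > setup_demo.py <<'EOF'
INSTALL_REQUIRES = [
    "rl_games",
    "torch",
]
EOF

# prefix any line mentioning rl_games with "# ", keeping a .bak backup
sed -i.bak '/rl_games/s/^/# /' setup_demo.py
```

Run against the actual `setup.py` in `third_party/IsaacGymEnvs`, this leaves the other requirements untouched while disabling the `rl_games` pin.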
If you hit `ImportError: libpython3.8.so.1.0: cannot open shared object file: No such file or directory`, use `import_isaacgym()` from `mineral/envs/isaacgymenvs.py`.
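Alternatively, a common workaround for this error in conda environments (a general conda fix, not something documented by this repo) is to put the environment's `lib/` directory on the dynamic linker's search path before launching Python:

```shell
# libpython3.8.so.1.0 ships inside the conda env; prepend its lib/ dir
# so the dynamic linker can resolve it when isaacgym is imported
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```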
## Usage

See commands in examples/:

- Rewarped
- DFlex
- IsaacGymEnvs
Want to use your own configs, agents, or envs? Check out `examples/run.py`, and replace `python -m mineral.scripts.run ...` with `python -m examples.run 'hydra.searchpath=[pkg://examples/cfgs]' ...` to load configs from `examples/cfgs/`.

Use `CUDA_VISIBLE_DEVICES=1 python ...` to run on a specific GPU.

Use `python ... run=eval task.env.render=True ckpt="workdir/<exp>/<run>/ckpt/final.pth"` to load checkpoints and visualize agents (trajectories are saved as USDs).