off-policy topic
hindsight-experience-replay
A PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.
drq
DrQ: Data regularized Q
exorl
ExORL: Exploratory Data for Offline Reinforcement Learning
curl
CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning
linorobot
Autonomous ground robots (2WD, 4WD, Ackermann Steering, Mecanum Drive)
rad
RAD: Reinforcement Learning with Augmented Data
sunrise
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
off-policy-continuous-control
Official PyTorch code for "Recurrent Off-policy Baselines for Memory-based Continuous Control" (DeepRL Workshop, NeurIPS 21)
flashbax
⚡ Flashbax: Accelerated Replay Buffers in JAX
causal-rl
Causal RL: Reverse-Environment Network Integrated Actor-Critic Algorithm
Reinforcement-Learning-solving-a-simple-4by4-Gridworld-using-Qlearning-in-python
Solving a simple 4x4 Gridworld, similar to OpenAI Gym's FrozenLake, using Q-learning, a temporal-difference reinforcement learning method.