lopatovsky
# Mixed precision

**Motivation**: Inspired by RLGames, we implemented automatic mixed precision to boost the performance of PPO.

**Sources:**

**Speed eval:**

- Big neural network (units: \[2048, 1024, 1024, 512])...
Motivation: Currently, both train and eval modes save checkpoints, including the best checkpoint. This caused some trouble on our side when we implemented parallel validation runs that were then...
### Proposal

Implement a Reinforcement Learning config class that configures all RL-related parameters (e.g. the RL agent and the RL model) in a modular fashion identical to the ManagerBasedRLEnvCfg architecture. This is an...
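A minimal sketch of what such a modular config hierarchy could look like, using plain Python dataclasses. The class and field names here (`PPOAgentCfg`, `RLModelCfg`, `RLCfg`) are illustrative assumptions, not the actual ManagerBasedRLEnvCfg API:

```python
from dataclasses import dataclass, field

@dataclass
class PPOAgentCfg:
    """Hypothetical agent sub-config: algorithm hyperparameters."""
    learning_rate: float = 3e-4
    clip_ratio: float = 0.2

@dataclass
class RLModelCfg:
    """Hypothetical model sub-config: network architecture."""
    hidden_units: tuple = (2048, 1024, 1024, 512)
    activation: str = "elu"

@dataclass
class RLCfg:
    """Top-level RL config composing agent and model sub-configs."""
    agent: PPOAgentCfg = field(default_factory=PPOAgentCfg)
    model: RLModelCfg = field(default_factory=RLModelCfg)

cfg = RLCfg()
cfg.agent.learning_rate = 1e-4  # override a single field after construction
```

Each sub-config can then be swapped or overridden independently, mirroring how environment sub-configs compose in the manager-based architecture.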
# Mixed precision

**Motivation**: Inspired by RLGames, we implemented automatic mixed precision to boost the performance of PPO_RNN, especially for big models.

**Sources:**

**Speed eval:**

- model with one layer...
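A minimal sketch of an automatic mixed precision update step in PyTorch, in the style RLGames uses (`torch.autocast` plus a gradient scaler). The network shape and loss here are placeholders, not the actual PPO_RNN implementation; the scaler is disabled when no GPU is present so the snippet also runs on CPU:

```python
import torch
from torch import nn

# Placeholder value network; units are illustrative only.
model = nn.Sequential(nn.Linear(8, 256), nn.ReLU(), nn.Linear(256, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# GradScaler guards against fp16 gradient underflow; it is a no-op on CPU.
use_cuda = torch.cuda.is_available()
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

obs = torch.randn(32, 8)
returns = torch.randn(32, 1)

device_type = "cuda" if use_cuda else "cpu"
amp_dtype = torch.float16 if use_cuda else torch.bfloat16
with torch.autocast(device_type=device_type, dtype=amp_dtype):
    # Forward pass and loss run in reduced precision.
    values = model(obs)
    loss = nn.functional.mse_loss(values, returns)

optimizer.zero_grad()
scaler.scale(loss).backward()  # scaled backward pass
scaler.step(optimizer)         # unscales grads, then steps
scaler.update()
```

In a full PPO update the same autocast context would wrap the policy and value losses, with the scaler shared across both backward passes.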