CppMaster

3 issues by CppMaster

### 🚀 Feature
An option to collect rollouts for n_episodes instead of n_steps for on-policy algorithms.

### Motivation
Some environments, like games, have the most important reward at the...

enhancement

**Motivation** MaskablePPO is great for large discrete action spaces that have many invalid actions at each step, while RecurrentPPO is useful for giving the agent a memory of previous...

duplicate
enhancement

# 🌟 New adapter setup

## Model description
[Canine-c](https://huggingface.co/google/canine-c) "CANINE pre-trained with autoregressive character loss"

## Open source status
* [x] the model implementation is available: [huggingface github](https://github.com/huggingface/transformers/blob/main/src/transformers/models/canine/modeling_canine.py)
*...

enhancement