Hector Kohler issues

Results: 4 issues of Hector Kohler

PPO agent does not work (RuntimeError: IndexError).

There is a mistake in the computation of the advantages and returns. Indeed, the `_compute_returns_avantages()` function assumes that the list `rewards` is of size `self.horizon`. However, it is not always...
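To illustrate the kind of failure described above, here is a minimal sketch of a returns/advantages computation that iterates over the actual length of `rewards` instead of a fixed horizon. This is not rlberry's implementation: the function name `compute_returns_and_advantages`, its parameters, and the use of generalized advantage estimation (GAE) are assumptions made for illustration only.

```python
def compute_returns_and_advantages(rewards, values, last_value, dones,
                                   gamma=0.99, lam=0.95):
    """Hypothetical helper (not rlberry's API): GAE over a variable-length rollout."""
    # Use the real rollout length; indexing up to a fixed horizon would
    # raise IndexError whenever the rollout is shorter than that horizon.
    n = len(rewards)
    if len(values) != n or len(dones) != n:
        raise ValueError("rewards, values and dones must have the same length")

    advantages = [0.0] * n
    gae = 0.0
    next_value = last_value  # bootstrap value for the state after the last step
    for t in reversed(range(n)):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; the bootstrap term is dropped at episode ends.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]

    # Returns are the advantages plus the value baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return returns, advantages
```

Because the loop bound comes from `len(rewards)`, a rollout that ends before the horizon is reached is handled without any out-of-range indexing.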

Actor Critic Agents are less sample efficient in general (?) since #290

@mmcenta, it seems that some changes in the model since #290 are making A2C and PPO worse on some benchmarks, in particular @YannBerthelot's probing environment tests. Let's discuss

bug

enhancement

question

discussion

update user guide

## Description PR for issues #325 and #353. I removed some placeholders in the user guide that we would probably never fill in, and I wrote a user guide page for...

documentation

ready for review

Marathon

Use sb3 and gymnasium as much as possible rather than hiding imports in rlberry's API

Rather than using `rlberry.envs.gym_make`, let us use `gymnasium.make`. Rather than using `rlberry.agents.torch`, let us use [rlberry.agents.stable_baselines](https://rlberry-py.github.io/rlberry/generated/rlberry.agents.stable_baselines.StableBaselinesAgent.html#rlberry.agents.stable_baselines.StableBaselinesAgent). Same for atari_make and spaces.... basically if an rlberry class is just import a...

enhancement

good first issue