Edouard Leurent

176 comments by Edouard Leurent

This agent is a *planning* algorithm (Monte-Carlo Tree Search), not a *reinforcement learning* algorithm: it samples transitions from *known* dynamics to find an optimal trajectory starting from the...
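To make the distinction concrete, here is a minimal sketch of planning by sampling from a known model. It uses uniform random rollouts (random shooting), a much simpler stand-in for Monte-Carlo Tree Search, and it is not the rl-agents implementation. The 1-D chain environment, the `step` and `plan` functions, and all parameter values are hypothetical.

```python
import random


def step(state, action):
    """Known dynamics of a hypothetical 1-D chain: positions 0..5, goal at 5."""
    next_state = max(0, min(5, state + action))
    reward = 1.0 if next_state == 5 else 0.0
    return next_state, reward


def plan(state, horizon=6, n_rollouts=2000, gamma=0.9, seed=0):
    """Sample action sequences through the known model and return the
    first action of the best sampled trajectory."""
    rng = random.Random(seed)
    best_return, best_action = float("-inf"), None
    for _ in range(n_rollouts):
        actions = [rng.choice([-1, 1]) for _ in range(horizon)]
        s, total = state, 0.0
        for t, a in enumerate(actions):
            s, r = step(s, a)  # query the model, not a real system
            total += (gamma ** t) * r
        if total > best_return:
            best_return, best_action = total, actions[0]
    return best_action


first_action = plan(4)
```

No learning happens here: nothing is stored between calls to `plan`, and every decision is recomputed by querying the model from the current state.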

That is very strange; it should be working, all the more so if it appears in the `pip3 list` output. Maybe check that the repo is indeed cloned into your `/home//.local/lib/python3.7/site-packages`? (since...

Hi, yes, unfortunately CEM is the only implemented method that handles continuous actions, because my own work focuses on discrete actions. To increase the running...

Hi, > Hi, I'm using MCTS in your rl-agents repo under the env of your another repo highway_env. In agents/common/factory.py, I understand the function safe_deepcopy_env() copies the current state for...
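The idea discussed in this question, planning on a copy of the environment so that simulated steps leave the real one untouched, can be sketched with the standard `copy.deepcopy`. This is only an illustration: `ToyEnv` is a hypothetical stand-in, and the code below is not the body of `safe_deepcopy_env()`.

```python
import copy


class ToyEnv:
    """Hypothetical stand-in for an environment; not the highway_env API."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        self.position += action
        return self.position


real_env = ToyEnv()
real_env.step(1)

# The planner works on a deep copy, so simulated steps never touch the real env.
sim_env = copy.deepcopy(real_env)
for _ in range(3):
    sim_env.step(1)
```

After the loop, `sim_env` has advanced three steps while `real_env` is still in the state it had when the copy was made.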

> (1) I modify the original safe_deepcopy_env() in factory.py to the following: > And here is what the console prints(many times): > safe_new_simulation_env vehicle > So I think all the...

> What I want to do is construct two environments. In real env, other vehicles have the 'target_speed' of 20, while in the simulation env, other vehicles have the target_speed...

Hi @hebowei2000, I do not think there is anything wrong with the implemented algorithms (but I may be mistaken). I think you'll find that, perhaps surprisingly, MCTS algorithms are not...

Contributions are absolutely welcome, and guidelines are unfortunately lacking. I'll try to address this. The main steps would be: * Inherit from [AbstractAgent](https://github.com/eleurent/rl-agents/blob/master/rl_agents/agents/common/abstract.py) * You must in particular implement the...

Hello @Pei-w, `in_width` is a configuration parameter describing the shape of the inputs of a Convolutional Network. Convolutional Networks are typically used with image-like inputs, which are shaped as (channels, height,...
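As an illustration of why the input shape matters, the sketch below applies the standard convolution output-size formula to one spatial dimension. It assumes the usual channels-first layout (channels, height, width). The `in_channels` and `in_height` keys are assumed analogues of `in_width`, and all numeric values are hypothetical.

```python
def conv_output_size(in_size, kernel_size, stride=1, padding=0):
    """Output length of a convolution along one spatial dimension."""
    return (in_size + 2 * padding - kernel_size) // stride + 1


# Hypothetical input shape: 4 channels, 64 pixels high, 48 pixels wide.
config = {"in_channels": 4, "in_height": 64, "in_width": 48}

out_height = conv_output_size(config["in_height"], kernel_size=3, stride=2)
out_width = conv_output_size(config["in_width"], kernel_size=3, stride=2)
```

The sizes of the layers after the convolutions depend on these values, so the configured input shape has to match the actual observations.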

Hi @DanialTaheri, like value iteration, robust value iteration requires knowledge of the MDP(s), described in the form of a `FiniteMDPEnv`, defined in [this project](https://github.com/eleurent/finite-mdp). (It can also...
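To show what "knowledge of the MDP" means in practice, here is a minimal sketch of plain (non-robust) value iteration on a tiny deterministic MDP given as explicit tables. It does not use the `FiniteMDPEnv` API. The `transition` and `reward` tables and the `value_iteration` function are hypothetical.

```python
def value_iteration(transition, reward, gamma=0.9, iterations=100):
    """transition[s][a] is the next state, reward[s][a] the immediate reward."""
    n_states = len(transition)
    values = [0.0] * n_states
    for _ in range(iterations):
        # Bellman optimality backup, computed from the known model only.
        values = [
            max(
                reward[s][a] + gamma * values[transition[s][a]]
                for a in range(len(transition[s]))
            )
            for s in range(n_states)
        ]
    return values


# Two states, two actions: action 1 moves to (or stays in) state 1,
# and staying in state 1 yields a reward of 1.
transition = [[0, 1], [0, 1]]
reward = [[0.0, 0.0], [0.0, 1.0]]
values = value_iteration(transition, reward)
```

The backup reads `transition` and `reward` directly, which is why the algorithm cannot be run on an environment that only exposes sampled interactions.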