Federico Andres Lois

Results 33 issues of Federico Andres Lois

I have been playing with the agent and noticed that my Q values are clustered in the [10..20] range, so I set vmin=10 and vmax=20, but if vmin is bigger than...

Using version 0.11.1, I wanted to modify a particular head in order to change some calculations while fulfilling the agent requirements, and found that you cannot instantiate the new head if...

Hi, I stumbled upon the following potential improvement. I am hacking it right now, but it would be great to have a proper solution. MCTS and other forward simulation techniques...

priority/p1

Hi @galleibo-intel and team, I am trying to create a Gated PixelCNN embedder for images and train it jointly with the RL model. While creating the 'layer' is kinda straightforward,...

During training, machines sometimes get rebooted, etc. When training resumes automatically from a checkpoint, the next checkpoint no longer continues the numbering from where it is...

Newbie to Coach here. I need some advice on how to approach the implementation of ``Learning by Playing – Solving Sparse Reward Tasks from Scratch`` https://arxiv.org/abs/1802.10567 using multiple heads as...

question

I have been using these two routines to figure out the best learning rate to apply with awesome results on SAC. However, the changes in the `temperature` alter those values...

I am not issuing a PR for this one because it uses the modified Dataset-based version of the trainer... But I am making it available through this issue in case you...

This example dataset-based trainer also does expert signal recollection, which is why I didn't do a PR. I will leave it to you to decide which parts make sense...

### Issue link
https://issues.hibernatingrhinos.com/issue/RavenDB-19176

### Type of change
- Bug fix

Corax
v5.4