Matheus M. Centa comments

Results: 12 comments of Matheus M. Centa

To be, or not to be A2C

Hello! Hopefully, I can answer these questions :) 1. REINFORCE with Baseline can use any state-dependent baseline (you don't even have to use advantages, you can use the Q values...
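As a rough illustration of the point about state-dependent baselines, here is a minimal sketch of how the per-step weights in REINFORCE with a baseline could be computed. The function names (`discounted_returns`, `policy_gradient_weights`) and the plain-list representation are assumptions made for this example, not code from the thread.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def policy_gradient_weights(rewards, baselines, gamma=0.99):
    """Weights that multiply grad log pi(a_t | s_t) in REINFORCE with a baseline.

    `baselines[t]` may be any quantity that depends only on the state at
    step t (hypothetical input; e.g. a learned value estimate).
    """
    if len(rewards) != len(baselines):
        raise ValueError("rewards and baselines must have the same length")
    returns = discounted_returns(rewards, gamma)
    return [g - b for g, b in zip(returns, baselines)]
```

If the baseline is a learned state-value estimate, these weights can be read as Monte Carlo advantage estimates; with a baseline of zero everywhere, the sketch reduces to plain REINFORCE.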

[WIP] (feat) Seeding torch & rlberry

I was thinking about recording this metadata by using Python's `logging` library since it has all the features we need. I also noticed today that you might need to fork...

Support environments on rlberry?

You're right, I hadn't thought about the use case of debugging agents. Currently, we have: - `benchmarks`: these seem to be toy problems for exploration and generalization, which I think...

Add option for pretrained embeddings

> Thanks for working on this! It would be really helpful if you can add the usage of the new argument in README. Also, can we also test it by:...

Add option for pretrained embeddings

> For dimension match: yeah, I think we should assert that the loaded vectors have the same dimension as `--representation_size`, otherwise just abort the program. I was thinking about disabling...
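The dimension check described here could look roughly like the sketch below. It assumes the pretrained vectors are already loaded into a dict mapping each token to a list of floats; `check_embedding_dim` and that dict layout are hypothetical and not taken from the pull request.

```python
import sys


def check_embedding_dim(vectors, representation_size):
    """Abort the program if any loaded vector has the wrong dimension.

    `vectors` is assumed to map token -> list of floats (hypothetical layout);
    `representation_size` is the value passed via `--representation_size`.
    Returns the number of vectors checked when all of them match.
    """
    for token, vec in vectors.items():
        if len(vec) != representation_size:
            # sys.exit with a string prints it to stderr and exits with status 1.
            sys.exit(
                f"Pretrained vector for {token!r} has dimension {len(vec)}, "
                f"expected {representation_size}"
            )
    return len(vectors)
```

Failing fast here avoids silently training with embeddings whose shape does not match the model.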

Add option for pretrained embeddings

> Also, how do we handle the case when vocabulary in the pre-trained embedding does not match the list of graph nodes? I'm not sure. I thought that it would...

Add option for pretrained embeddings

Thanks for taking the time to help me out! I am kind of taking some time to study for my finals right now, but I will be back soon to...

Add option for pretrained embeddings

I'm back from finals and vacations! I just implemented two of the improvements we talked about, and I wanted your opinion on this next one: the way the code is...

Fix inconsistency of layers/net_arch usage in cnn policy between different algorithms

Hello, I just ran into this inconsistency while implementing a project this week and we basically copied the code from the FeedForwardPolicy from DQN to adapt the code for our...

Fix inconsistency of layers/net_arch usage in cnn policy between different algorithms

@Miffyli Now that I read the issue a second time, I don't think it is worth it to change the behavior of the `layers` parameter now that v3 is on...