astooke

80 comments by astooke

@drozzy @LecJackS If you are making a custom env, it is better to just use the rlpyt base env class (https://github.com/astooke/rlpyt/blob/master/rlpyt/envs/base.py), and follow that interface. No need to go through...
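A minimal sketch of that interface (the real base class is at the URL above; the `EnvStep` namedtuple and the class below are stand-ins so the sketch runs without rlpyt installed — in actual rlpyt you would subclass `rlpyt.envs.base.Env` and also define `action_space`/`observation_space`):

```python
from collections import namedtuple

import numpy as np

# Stand-in for rlpyt's EnvStep namedarraytuple (see rlpyt/envs/base.py).
EnvStep = namedtuple("EnvStep", ["observation", "reward", "done", "env_info"])


class MyCustomEnv:
    """Sketch of the interface the rlpyt base Env class expects."""

    def __init__(self, episode_length=10):
        self._episode_length = episode_length
        self._t = 0

    def reset(self):
        """Reset episode state and return the initial observation."""
        self._t = 0
        return np.zeros(4, dtype=np.float32)

    def step(self, action):
        """Advance one time step and return an EnvStep tuple."""
        self._t += 1
        obs = np.full(4, self._t, dtype=np.float32)
        reward = 1.0
        done = self._t >= self._episode_length
        return EnvStep(obs, reward, done, {})

    @property
    def horizon(self):
        return self._episode_length
```

In real rlpyt you would also give `action_space` and `observation_space` properties built from rlpyt's space classes, so the sampler can allocate buffers.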

@frankie4fingers that's an unexpected problem! Could you provide more details? The environment should be instantiated separately within each child process in the parallel samplers.
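To illustrate the point: rlpyt's parallel samplers take `EnvCls` and `env_kwargs` rather than a built env, so each worker constructs its own instance and only the lightweight spec has to cross the process boundary. A sketch of that pattern (the `ToyEnv` class is hypothetical; pickling stands in for the actual process handoff):

```python
import pickle


class ToyEnv:
    """Hypothetical env; in practice this is whatever EnvCls you pass."""

    def __init__(self, level=1):
        self.level = level


# Parent process: hand over the class + kwargs, not a live env object.
EnvCls, env_kwargs = ToyEnv, {"level": 3}
spec = pickle.dumps((EnvCls, env_kwargs))  # must be picklable

# Child process: build the env locally from the spec.
cls, kwargs = pickle.loads(spec)
env = cls(**kwargs)
```

Because construction happens in the child, any unpicklable state the env creates (file handles, GPU contexts, etc.) never needs to cross processes.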

@im-ant Yes, the main thing to get C51 working with a custom environment is just to write your own model class. Or maybe your environment has the same observation and...
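For reference, the output a C51-style model class has to produce is a per-action softmax over fixed support atoms; expected Q-values are then the probability-weighted support. A numpy sketch of that head's math (all sizes and the random logits are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# C51 represents Q(s, a) as a categorical distribution over a fixed
# support of atoms z in [V_min, V_max]; the model outputs, per action,
# a softmax over those atoms.
n_atoms, n_actions = 51, 4                       # illustrative sizes
z = np.linspace(-10.0, 10.0, n_atoms)            # support atoms
logits = rng.standard_normal((n_actions, n_atoms))
p = np.exp(logits - logits.max(axis=-1, keepdims=True))
p /= p.sum(axis=-1, keepdims=True)               # softmax over atoms
q = (p * z).sum(axis=-1)                         # expected Q per action
greedy_action = int(np.argmax(q))
```

A custom model class just needs to emit that `[actions, n_atoms]` distribution (with a leading batch dimension) for the rest of the C51 machinery to consume.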

@frankie4fingers OK thanks for explaining the problem and the quick workaround. I'm still a bit surprised by this, because I've run gym envs in parallel before. And when the child...

Hmm, ok things get a little tricky with prioritized sequence replay. My first suggestion would be to construct the sequence replay buffer with its `batch_T` equal to the longest K...
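A toy illustration of what fixed-length sequence sampling looks like with a `batch_T`-step window (the flat buffer and function names are illustrative, not rlpyt's actual replay internals):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy flat buffer of T transitions; a sequence replay buffer with a
# batch_T-step window returns contiguous slices of that length.
T, batch_T = 100, 8
rewards = np.arange(T, dtype=np.float32)


def sample_sequences(batch_size):
    """Draw contiguous windows of batch_T steps from the buffer."""
    starts = rng.integers(0, T - batch_T + 1, size=batch_size)
    return np.stack([rewards[s:s + batch_T] for s in starts])  # [batch, batch_T]


batch = sample_sequences(4)
```

With prioritization, the `starts` draw would be weighted by sequence priority instead of uniform, but the windowing is the same.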

Wow, very sorry I missed your last question! I am surprised that the code would run so much slower. Possibly it was sampling huge batches, with a lot of extra...

Hmmm, I don't have a full answer for this because it's specifics of one RL problem...but one thing that might help is to clip the actions inside the environment, but...
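A minimal sketch of clipping actions inside the environment (the wrapper, bounds, and the echo env used to exercise it are all illustrative):

```python
import numpy as np


class ActionClippingEnv:
    """Clip incoming actions to the env's valid bounds inside step(),
    so out-of-range agent outputs never reach the underlying dynamics."""

    def __init__(self, env, low=-1.0, high=1.0):
        self.env = env
        self.low, self.high = low, high

    def step(self, action):
        return self.env.step(np.clip(action, self.low, self.high))


class _EchoEnv:
    """Toy stand-in env: step() just returns the action it received."""

    def step(self, action):
        return action


env = ActionClippingEnv(_EchoEnv())
out = env.step(np.array([2.0, -3.0, 0.5]))  # clipped to [1.0, -1.0, 0.5]
```

Clipping inside the env (rather than in the policy) keeps the agent's raw output intact for the gradient while guaranteeing the dynamics only ever see valid actions.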

Hmm, how long is your `config["runner"]["log_interval_steps"]`, and how many environments are you using (`config["sampler"]["batch_B"]`)? If the first of those is small and the second is large, it could simply be...
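The rough arithmetic behind that question, assuming rlpyt's usual accounting of `batch_T * batch_B` environment steps per sampler iteration (all numbers below are illustrative):

```python
# Each sampler iteration gathers batch_T * batch_B environment steps,
# so logging fires roughly every log_interval_steps / (batch_T * batch_B)
# iterations. With many envs and a short interval, that can be every
# couple of iterations.
batch_T, batch_B = 5, 64
log_interval_steps = 1000

steps_per_iter = batch_T * batch_B                            # 320 steps/iter
iters_per_log = max(1, log_interval_steps // steps_per_iter)  # ~3 iterations
```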

Hi, this is an interesting point that could make things easier. They were not intended to be related, but what functionality from PyTorch distributions are you looking for?

Hi! This is because in PyTorch 1.2, the grad norm is returned as a Python float. In later versions of PyTorch, it's a PyTorch tensor, which requires calling `.item()` to...
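A version-agnostic way to handle that return value, via duck typing (the `DummyTensor` stand-in just lets the sketch run without torch installed):

```python
class DummyTensor:
    """Stand-in for a 0-d PyTorch tensor, so this runs without torch."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


def grad_norm_to_float(grad_norm):
    # Older PyTorch returned a Python float; newer versions return a
    # 0-d tensor. Duck-typing on .item() handles both cases.
    return grad_norm.item() if hasattr(grad_norm, "item") else float(grad_norm)
```

Dropping a helper like this around the `clip_grad_norm_` call site lets the same logging code run across PyTorch versions.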