rlpyt

Reinforcement Learning in PyTorch

Results 64 rlpyt issues

I have some performance issues with the sequence buffers. I have traced it to the extract_sequence function in rlpyt/utils/misc.py. It's implemented with a loop over all batch elements. This seems...
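For readers following this issue, here is a minimal sketch of the kind of change being discussed, not the library's actual code. It assumes a buffer array with leading `[T, B]` dimensions and no wraparound at the end of the buffer. The function names `extract_loop` and `extract_vectorized` and their parameters are hypothetical, chosen only for illustration. The first function copies one sequence per batch element in a Python loop; the second does the same with a single NumPy advanced-indexing call.

```python
import numpy as np


def extract_loop(arr, T_idxs, B_idxs, T):
    """Loop-based reference (hypothetical): one slice copy per batch element."""
    out = np.empty((T, len(T_idxs)) + arr.shape[2:], dtype=arr.dtype)
    for i, (t, b) in enumerate(zip(T_idxs, B_idxs)):
        out[:, i] = arr[t:t + T, b]  # sequence of length T starting at time t
    return out


def extract_vectorized(arr, T_idxs, B_idxs, T):
    """Vectorized sketch (hypothetical): assumes every t + T fits in the buffer."""
    T_idxs = np.asarray(T_idxs)
    B_idxs = np.asarray(B_idxs)
    # Time indices for every sequence, shape [T, n].
    t = T_idxs[None, :] + np.arange(T)[:, None]
    # B_idxs broadcasts from [1, n] to [T, n]; result has shape [T, n, ...].
    return arr[t, B_idxs[None, :]]


# Small check that both versions agree on a toy buffer with [T=10, B=3] leading dims.
arr = np.arange(10 * 3 * 2).reshape(10, 3, 2)
T_idxs, B_idxs, T = [0, 4, 6], [2, 0, 1], 3
loop_out = extract_loop(arr, T_idxs, B_idxs, T)
vec_out = extract_vectorized(arr, T_idxs, B_idxs, T)
```

Whether this is actually faster depends on sequence length and batch size, so it would need profiling on the real buffers. A buffer that wraps around would also need the time indices taken modulo the buffer length, which this sketch does not do.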

It would be super useful for me to see an example of how to use a custom gym environment. Is there an example of this somewhere? The problem with built-in...

Hi @juliusfrost if you have a minute, could you please help figure out why the codecov tests are failing on my last commit, which was a tiny one-line change? It...

Hi, I'm trying to run the R2D1 asynchronous alternating code that came with rlpyt on Breakout. I'm wondering if anyone's had success so far in replicating DeepMind's R2D2 benchmark on...

Hello and thanks for a great repo! I have a question about resuming the training of an agent. Let's say I have an agent (custom DqnAgent) and it is trained...

I tried the following: `python3 example_1.py --cuda_idx=0`. This ran successfully, and I could see that it was using the GPUs. When I tried the following: `nvprof --print-gpu-trace python3 example_1.py --cuda_idx=0`...

Hello @astooke, great work with this wonderful library. Working with DQN/Cat_DQN, it seems that the `__call__` method of a `DqnAgent`/`CatDqnAgent`/`R2d1Agent` transfers the result of its computation to the CPU....

I've been trying to set up a multiworker Rainbow DQN baseline for procgen, similar to what's described in [Leveraging Procedural Generation for Benchmarking Reinforcement Learning](https://arxiv.org/abs/1912.01588). This is roughly how I'm...

In example_1, if we see only one image per time step for Pong instead of a minibatch of 4, the program crashes the first time the buffer is updated:...

Hi, guys. Love the idea for this. Read the paper, very interesting. Lots of nice design went into this, and I'm eager to try it. How would I add some...