rlpyt
Reinforcement Learning in PyTorch
What should I do if I want to use this framework on Windows?
see https://github.com/astooke/rlpyt/blob/f04f23db1eb7b5915d88401fca67869968a07a37/rlpyt/agents/dqn/dqn_agent.py#L29 The predicted Q value and target Q value are calculated on the GPU and then moved to the CPU. Consequently, the DQN loss is calculated on the CPU. I'm confused...
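For context, a minimal sketch of the pattern the question is asking about (this is an illustration, not rlpyt's actual code; the network, batch names, and shapes are assumptions): the forward passes run on the GPU, the outputs are moved to the CPU, and the TD loss is assembled there, while gradients still flow back to the GPU parameters.

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: both networks live on the GPU when available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
q_net = torch.nn.Linear(4, 2).to(device)
target_net = torch.nn.Linear(4, 2).to(device)

# Hypothetical CPU batch, standing in for samples from a replay buffer.
obs = torch.randn(32, 4)
next_obs = torch.randn(32, 4)
action = torch.randint(0, 2, (32,))
reward = torch.randn(32)
done = torch.zeros(32)

# Forward passes on the GPU, results moved back to the CPU ...
q = q_net(obs.to(device)).cpu()
with torch.no_grad():
    target_q = target_net(next_obs.to(device)).cpu()

# ... so the DQN loss itself is formed on the CPU.
q_a = q.gather(1, action.unsqueeze(1)).squeeze(1)
target = reward + 0.99 * (1 - done) * target_q.max(dim=1).values
loss = F.smooth_l1_loss(q_a, target)
loss.backward()  # gradients still reach the GPU-resident q_net parameters
```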
Upon running the training script for UL+RL [link](https://github.com/astooke/rlpyt/blob/master/rlpyt/ul/experiments/rl_from_ul/scripts/dmcontrol/train/dmc_sac_from_ul_serial.py), I get two kinds of step metrics: CumSteps (e.g. 23,000) and EnvSteps (184,000 for the corresponding CumSteps). The reward at this snapshot...
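One way to sanity-check the two counters, under an assumption that is not documented here: if EnvSteps counts raw environment frames while CumSteps counts agent decision steps, the two should differ by the action-repeat (frame-skip) factor.

```python
cum_steps = 23_000    # reported as CumSteps
env_steps = 184_000   # reported as EnvSteps at the same snapshot
print(env_steps / cum_steps)  # 8.0 -- consistent with an action repeat of 8,
                              # if that is indeed the relationship
```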
I would like to use the code in the directory `rlpyt/rlpyt/ul/`. However, the code does not appear to work out of the box. For example, some paths in the author's filesystem are...
The paper mentions https://github.com/astooke/safe-rlpyt (404). This repo mentions the code in the commits, but I don't find anything at the head. Where can I find an official implementation?
Hi, I recently started using this wonderful library, but have been occasionally experiencing a small quality-of-life issue where `parallel.base.ParallelSamplerBase.shutdown` hangs after all the workers have finished, but the worker processes...
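Not a fix for the root cause, but a generic workaround sketch for hung shutdowns (this is plain `multiprocessing`, not rlpyt's `ParallelSamplerBase` API; the `workers` list and the timeout value are assumptions): give each worker a grace period to exit, then terminate and reap any straggler.

```python
import multiprocessing as mp

def shutdown_with_timeout(workers, timeout=10.0):
    """Generic pattern for reclaiming worker processes that won't exit:
    `workers` is assumed to be a list of mp.Process objects."""
    for w in workers:
        w.join(timeout=timeout)  # wait up to `timeout` seconds
        if w.is_alive():
            w.terminate()        # force-kill a straggler
            w.join()             # reap it so no zombie is left behind
```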
Hi, I noticed that in the `SAC` implementation, an `action_prior` is introduced at init:
```python
if self.action_prior == "uniform":
    prior_log_pi = 0.0
elif self.action_prior == "gaussian":
    prior_log_pi = self.action_prior_distribution.log_likelihood(
        action, GaussianDistInfo(mean=torch.zeros_like(action)))
```
...
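For readers puzzling over the same snippet, a standalone sketch of what the "gaussian" branch computes (the helper name is hypothetical, and this uses `torch.distributions` rather than rlpyt's own distribution classes): the log-density of the sampled action under a standard normal, summed over action dimensions, whereas the "uniform" branch contributes only a constant.

```python
import torch

def standard_normal_log_prob(action):
    """Hypothetical standalone equivalent of the 'gaussian' prior term:
    log N(action; 0, I), summed over the action dimensions."""
    dist = torch.distributions.Normal(
        torch.zeros_like(action), torch.ones_like(action))
    return dist.log_prob(action).sum(dim=-1)

# With action_prior == "uniform", prior_log_pi is a constant (0.0) and has
# no effect on the gradient; the Gaussian prior instead pulls the policy
# toward small-magnitude actions.
a = torch.randn(32, 6)                      # batch of 6-dim actions
prior_log_pi = standard_normal_log_prob(a)  # shape: (32,)
```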
Hi, I tried launching r2d1 on Atari using _atari_r2d1_async_alt_ and I get this error:
```
call string: taskset -c 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 /home/mila/n/nekoeiha/.conda/envs/rlpyt/bin/python /home/mila/n/nekoeiha/MILA/rlpyt/rlpyt/experiments/scripts/atari/dqn/launch/pabti/../../train/atari_r2d1_async_alt.py 0slt_24cpu_4gpu_0hto_1ass_2sgr_1alt /home/mila/n/nekoeiha/MILA/rlpyt/data/local/20200824/161821/atari_r2d1_async_alt/gravitar 0 async_alt_pabti
Unable to import tensorboard...
```
Hi Adam, the manager and the worker seem to just be staring each other down, and nothing much happens. I have cobbled together a main program here using...
I noticed the data is supposed to be stored inside the `rlpyt/data/local` folder. However, I don't see any rewards, plots, network checkpoints, or anything else.
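In case it helps anyone hitting the same thing: the example scripts only write progress logs and parameter snapshots when training is wrapped in rlpyt's logging context, roughly like the sketch below. This is paraphrased from memory of the example scripts; the argument order, `snapshot_mode` value, and output layout under `data/local/` are assumptions, so check `rlpyt/examples/` for the exact usage.

```python
from rlpyt.utils.logging.context import logger_context

def train_with_logging(runner, run_ID=0):
    # Wrapping runner.train() in logger_context is (as far as I can tell)
    # what makes rlpyt write progress.csv, debug.log, and params.pkl under
    # data/local/<date>/<log_dir>/run_<run_ID>/; without it, nothing lands
    # on disk. `runner` is assumed to be an already-built rlpyt runner.
    config = dict(note="hyperparameters to record go here")
    with logger_context("my_experiment",   # log_dir
                        run_ID,            # run index
                        "dqn_pong",        # experiment name
                        config,
                        snapshot_mode="last"):  # keep the latest params.pkl
        runner.train()
```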