rlpyt
Reinforcement Learning in PyTorch
What should I do if I want to use this framework on Windows?
see https://github.com/astooke/rlpyt/blob/f04f23db1eb7b5915d88401fca67869968a07a37/rlpyt/agents/dqn/dqn_agent.py#L29 The predicted Q value and target Q value are calculated on the GPU and then moved to the CPU. Consequently, the DQN loss is calculated on the CPU. I'm confused...
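For context, a minimal sketch of the pattern the question is asking about (this is an illustration, not rlpyt's actual code; the network, batch names, and shapes are assumptions): the forward passes run on the GPU, the outputs are moved to the CPU, and the TD loss is assembled there, while gradients still flow back to the GPU parameters.

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: both networks live on the GPU when available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
q_net = torch.nn.Linear(4, 2).to(device)
target_net = torch.nn.Linear(4, 2).to(device)

# Hypothetical CPU batch, standing in for samples from a replay buffer.
obs = torch.randn(32, 4)
next_obs = torch.randn(32, 4)
action = torch.randint(0, 2, (32,))
reward = torch.randn(32)
done = torch.zeros(32)

# Forward passes on the GPU, results moved back to the CPU ...
q = q_net(obs.to(device)).cpu()
with torch.no_grad():
    target_q = target_net(next_obs.to(device)).cpu()

# ... so the DQN loss itself is formed on the CPU.
q_a = q.gather(1, action.unsqueeze(1)).squeeze(1)
target = reward + 0.99 * (1 - done) * target_q.max(dim=1).values
loss = F.smooth_l1_loss(q_a, target)
loss.backward()  # gradients still reach the GPU-resident q_net parameters
```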
Upon running the training script for UL+RL [link](https://github.com/astooke/rlpyt/blob/master/rlpyt/ul/experiments/rl_from_ul/scripts/dmcontrol/train/dmc_sac_from_ul_serial.py), I get two kinds of step metrics: CumSteps (e.g. 23,000) and EnvSteps (184,000 for the corresponding CumSteps). The reward at this snapshot...
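One way to sanity-check the two counters, under an assumption that is not documented here: if EnvSteps counts raw environment frames while CumSteps counts agent decision steps, the two should differ by the action-repeat (frame-skip) factor.

```python
cum_steps = 23_000    # reported as CumSteps
env_steps = 184_000   # reported as EnvSteps at the same snapshot
print(env_steps / cum_steps)  # 8.0 -- consistent with an action repeat of 8,
                              # if that is indeed the relationship
```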
I would like to use the code in the directory `rlpyt/rlpyt/ul/`. However, the code does not appear to work out of the box. For example, some paths in the author's filesystem are...
The paper mentions https://github.com/astooke/safe-rlpyt (404). This repo mentions the code in the commits, but I don't find anything at the head. Where can I find an official implementation?
Hi, I recently started using this wonderful library, but have been occasionally experiencing a small quality-of-life issue where `parallel.base.ParallelSamplerBase.shutdown` hangs after all the workers have finished, but the worker processes...
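Not a fix for the root cause, but a generic workaround sketch for hung shutdowns (this is plain `multiprocessing`, not rlpyt's `ParallelSamplerBase` API; the `workers` list and the timeout value are assumptions): give each worker a grace period to exit, then terminate and reap any straggler.

```python
import multiprocessing as mp

def shutdown_with_timeout(workers, timeout=10.0):
    """Generic pattern for reclaiming worker processes that won't exit:
    `workers` is assumed to be a list of mp.Process objects."""
    for w in workers:
        w.join(timeout=timeout)  # wait up to `timeout` seconds
        if w.is_alive():
            w.terminate()        # force-kill a straggler
            w.join()             # reap it so no zombie is left behind
```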
Hi, I noticed that in the `SAC` implementation, an `action_prior` is introduced at init:
```python
if self.action_prior == "uniform":
    prior_log_pi = 0.0
elif self.action_prior == "gaussian":
    prior_log_pi = self.action_prior_distribution.log_likelihood(
        action, GaussianDistInfo(mean=torch.zeros_like(action)))
```
...
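For readers puzzling over the same snippet, a standalone sketch of what the "gaussian" branch computes (the helper name is hypothetical, and this uses `torch.distributions` rather than rlpyt's own distribution classes): the log-density of the sampled action under a standard normal, summed over action dimensions, whereas the "uniform" branch contributes only a constant.

```python
import torch

def standard_normal_log_prob(action):
    """Hypothetical standalone equivalent of the 'gaussian' prior term:
    log N(action; 0, I), summed over the action dimensions."""
    dist = torch.distributions.Normal(
        torch.zeros_like(action), torch.ones_like(action))
    return dist.log_prob(action).sum(dim=-1)

# With action_prior == "uniform", prior_log_pi is a constant (0.0) and has
# no effect on the gradient; the Gaussian prior instead pulls the policy
# toward small-magnitude actions.
a = torch.randn(32, 6)                      # batch of 6-dim actions
prior_log_pi = standard_normal_log_prob(a)  # shape: (32,)
```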
Hi, I tried launching r2d1 on Atari using _atari_r2d1_async_alt_ and I get this error:
```
call string: taskset -c 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 /home/mila/n/nekoeiha/.conda/envs/rlpyt/bin/python /home/mila/n/nekoeiha/MILA/rlpyt/rlpyt/experiments/scripts/atari/dqn/launch/pabti/../../train/atari_r2d1_async_alt.py 0slt_24cpu_4gpu_0hto_1ass_2sgr_1alt /home/mila/n/nekoeiha/MILA/rlpyt/data/local/20200824/161821/atari_r2d1_async_alt/gravitar 0 async_alt_pabti
Unable to import tensorboard...
```
Hi Adam, the manager and the worker seem to just be staring each other down, and nothing much happens. I have cobbled together a main program here using...
I noticed the data is supposed to be stored inside the `rlpyt/data/local` folder. However, I don't see any rewards, plots, network checkpoints, or anything else.
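In case it helps anyone hitting the same thing: the example scripts only write progress logs and parameter snapshots when training is wrapped in rlpyt's logging context, roughly like the sketch below. This is paraphrased from memory of the example scripts; the argument order, `snapshot_mode` value, and output layout under `data/local/` are assumptions, so check `rlpyt/examples/` for the exact usage.

```python
from rlpyt.utils.logging.context import logger_context

def train_with_logging(runner, run_ID=0):
    # Wrapping runner.train() in logger_context is (as far as I can tell)
    # what makes rlpyt write progress.csv, debug.log, and params.pkl under
    # data/local/<date>/<log_dir>/run_<run_ID>/; without it, nothing lands
    # on disk. `runner` is assumed to be an already-built rlpyt runner.
    config = dict(note="hyperparameters to record go here")
    with logger_context("my_experiment",   # log_dir
                        run_ID,            # run index
                        "dqn_pong",        # experiment name
                        config,
                        snapshot_mode="last"):  # keep the latest params.pkl
        runner.train()
```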