ptan
PyTorch Agent Net: reinforcement learning toolkit for pytorch
Hello, I am finding it very difficult to install against the previous PyTorch 0.4.0 now that 1.0 has been released. Is it possible to support PyTorch 1.0 and update...
Right now there is no way to fill the initial replay buffer with random actions.
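One workaround sketch, assuming the usual ptan pieces (`EpsilonGreedyActionSelector`, `DQNAgent`, `ExperienceSourceFirstLast`, `ExperienceReplayBuffer.populate`) behave as in the samples: drive the warm-up with epsilon = 1.0 so every action is random, then anneal epsilon once the buffer holds enough transitions. The buffer sizes and the final epsilon below are placeholders.

```python
import gym
import ptan
import torch.nn as nn

# Warm-up sketch: with epsilon == 1.0 the epsilon-greedy selector picks
# uniformly random actions, so populate() pre-fills the buffer with a
# random policy before training starts. Sizes below are placeholders.
env = gym.make("CartPole-v1")
net = nn.Sequential(
    nn.Linear(env.observation_space.shape[0], 128),
    nn.ReLU(),
    nn.Linear(128, env.action_space.n),
)

REPLAY_SIZE = 10_000
REPLAY_INITIAL = 1_000
GAMMA = 0.99

selector = ptan.actions.EpsilonGreedyActionSelector(epsilon=1.0)
agent = ptan.agent.DQNAgent(net, selector,
                            preprocessor=ptan.agent.float32_preprocessor)
exp_source = ptan.experience.ExperienceSourceFirstLast(env, agent, gamma=GAMMA)
buffer = ptan.experience.ExperienceReplayBuffer(exp_source, buffer_size=REPLAY_SIZE)

while len(buffer) < REPLAY_INITIAL:
    buffer.populate(1)      # collect one random-action transition per call

selector.epsilon = 0.1      # now anneal epsilon and start training as usual
```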
For example, when I run `a2c.py -r "runs/a2c/a2c_cartpole.ini"`, tons of errors pop up. Regardless, I like that you've implemented a lot of algorithms and put them here. It's very useful...
https://blog.openai.com/openai-baselines-dqn/ ... In the DQN Nature paper the authors write: “We also found it helpful to clip the error term from the update [...] to be between -1 and 1.” ...
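The usual way to implement that [-1, 1] error clipping in PyTorch is the Huber loss (smooth L1), which is the interpretation the linked post argues for. A minimal sketch with placeholder tensors:

```python
import torch
import torch.nn.functional as F

# Clipping the TD error to [-1, 1] is equivalent to the Huber loss: quadratic
# for |error| <= 1, linear outside, so the gradient magnitude stays bounded.
q_pred = torch.tensor([1.0, 2.5, -0.3], requires_grad=True)  # placeholder Q(s, a)
q_target = torch.tensor([1.2, 0.5, 3.0])                     # placeholder TD targets
loss = F.smooth_l1_loss(q_pred, q_target)
loss.backward()
```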
It seems that the examples under ptan/samples are outdated. For instance, the code for creating the agent in dqn_expreplay.py, `agent = ptan.agent.DQNAgent(model, action_selector, cuda=cuda_enabled)`, does not match the current definition. While...
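For comparison, a minimal sketch of the newer-style construction, assuming a ptan version whose `DQNAgent` takes a `device` argument instead of a `cuda` flag (worth checking against `ptan/agent.py` in the installed version):

```python
import torch
import torch.nn as nn
import ptan

# Sketch of the newer-style agent construction (assuming a ptan version in
# which DQNAgent takes `device` rather than a `cuda=` flag).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 2)).to(device)
selector = ptan.actions.ArgmaxActionSelector()
agent = ptan.agent.DQNAgent(net, selector, device=device)
```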
First, thanks for the great work! I tried the DQN speedup files and was able to get 01 and 02 to run (at about 50 fps on a GTX 1070), but...
Removed the deprecated `torch.autograd.Variable` usage from /samples/rainbow/lib/common and changed the loss function from nn.MSELoss() to nn.functional.mse_loss().
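A minimal before/after illustration of those two changes, with placeholder tensors:

```python
import torch
import torch.nn.functional as F

# Before (PyTorch <= 0.3 style):
#   from torch.autograd import Variable
#   q_pred_v = Variable(torch.FloatTensor(q_pred))
#   loss = nn.MSELoss()(q_pred_v, q_target_v)
#
# After: plain tensors track gradients themselves, and the functional form
# avoids constructing a loss module on every training step.
q_pred = torch.tensor([0.5, 1.0], requires_grad=True)   # placeholder values
q_target = torch.tensor([1.0, 1.0])
loss = F.mse_loss(q_pred, q_target)
loss.backward()
```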
https://pytorch.org/docs/stable/distributions.html Score functions and categorical sampling are already implemented in PyTorch, so using numpy for this should be discouraged. The policy network should output a probability distribution.
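A minimal sketch of the suggested approach using `torch.distributions.Categorical`; the logits and the return value are placeholders:

```python
import torch
from torch.distributions import Categorical

# Policy network outputs logits -> build a Categorical distribution,
# sample an action, and use log_prob() for the score-function (REINFORCE)
# gradient, all in torch with no numpy round-trips.
logits = torch.tensor([[0.2, 1.5, -0.7]], requires_grad=True)  # placeholder policy output
dist = Categorical(logits=logits)
action = dist.sample()
ret = 1.0                                                      # placeholder return
loss = -(dist.log_prob(action) * ret).mean()                   # score-function estimator
loss.backward()
```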
https://github.com/Shmuma/ptan/blob/84d349225f15a02164f28586b50cf94ee726eacc/ptan/experience.py#L497
https://github.com/Shmuma/ptan/blob/84d349225f15a02164f28586b50cf94ee726eacc/samples/rainbow/lib/common.py#L88