ReinforcementLearning.jl
A reinforcement learning package for Julia
TRPO
This PR implements Trust-Region Policy Optimization and adds a CartPole experiment for it. To this end, I wrote a few utility functions that are shared among policy gradient policies (#737)....
As I was testing MPO on the CartPole environment, I noticed that the algorithm was pretty unstable and had trouble settling on the policy that achieves a return of 200. I eventually thought about the...
I'm opening this as a draft so discussions are possible early. This implements the MPO algorithm from [this paper](https://arxiv.org/abs/1806.06920) and [its improved version](https://arxiv.org/abs/1812.02256)

PR Checklist
- [ ] Update NEWS.md?...
The definition for `MultiThreadEnv` exists under `src/algorithms/policy_gradient/multi_thread_env.jl`, but that file is not included in `policy_gradient.jl`, `algorithms.jl`, or `ReinforcementLearningZoo.jl`. Is this intentional? After trying to include it in `policy_gradient.jl`, I...
* [ ] [ACER](https://arxiv.org/abs/1611.01224)
* [ ] [TRPO](https://arxiv.org/abs/1502.05477)

See also John Schulman's [Python implementations](https://github.com/joschu/modular_rl)
From my point of view, there should be a tutorial that implements an environment with continuous action and state spaces. It took me some time to find out that...
Hello, I attempted to run SAC from the example experiment provided at https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/v0.10.1/src/ReinforcementLearningExperiments/deps/experiments/experiments/Policy%20Gradient/JuliaRL_SAC_Pendulum.jl (slightly modified for clarity). It is not learning a viable policy, though it runs without error. I...
# Goal

Improve the interactions between ReinforcementLearning.jl and other ecosystems in Julia.

## Why is it important?

In the early days of developing this package, the main goal was to...
Hello! While going through `vpg.jl` I had some odds-and-ends questions. I'm still pretty new to Julia and especially Flux.jl, so please bear with me :D

1. I don't understand the point...