Most deep RL algorithms in RLZoo assume the state to be an array. However, states represented as a graph or any other data structure should also be supported out of the box.
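For illustration, a minimal sketch of how multiple dispatch could make this work; `GraphState` and `encode` are hypothetical names here, not part of the RLZoo API:

```julia
# Multiple dispatch lets an agent accept any state representation,
# as long as the encoder defines a method for it.
struct GraphState
    adjacency::Matrix{Float64}   # adjacency matrix of the graph
    features::Matrix{Float64}    # one feature column per node
end

# Array states: the usual dense path.
encode(state::AbstractArray) = vec(state)

# Graph states: e.g. one round of neighborhood aggregation.
encode(state::GraphState) = vec(state.features * state.adjacency)

# Downstream code only ever calls `encode`, so supporting a new
# state type amounts to adding one method.
```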
Another interesting direction.

- [Reinforcement Learning for Combinatorial Optimization: A Survey](https://arxiv.org/abs/2003.03600)
This seems like an interesting direction, and it may require a specialized workflow. Ref:

- [Derivative-Free Reinforcement Learning: A Review](https://arxiv.org/abs/2102.05710)
- [BlackBoxOptim.jl](https://github.com/robertfeldt/BlackBoxOptim.jl)
- [CMAEvolutionStrategy.jl](https://github.com/jbrea/CMAEvolutionStrategy.jl)
- [BayesianOptimization.jl](https://github.com/jbrea/BayesianOptimization.jl)
- [Evolutionary.jl](https://github.com/wildart/Evolutionary.jl)
- [Evolving...
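For a rough idea of the workflow, here is a hedged sketch using the real `bboptimize` entry point of BlackBoxOptim.jl on a toy objective; `evaluate_policy` is a hypothetical stand-in for a full episode rollout:

```julia
using BlackBoxOptim

# Pretend `θ` parameterizes a policy; return a loss so that
# minimizing fitness maximizes episode return.
function evaluate_policy(θ)
    target = fill(0.5, length(θ))    # toy "optimal" parameters
    return sum(abs2, θ .- target)    # stand-in for -episode_return(θ)
end

res = bboptimize(evaluate_policy;
                 SearchRange = (-1.0, 1.0),
                 NumDimensions = 8,
                 MaxFuncEvals = 5_000)

best_candidate(res)  # the policy parameters found
```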
https://github.com/JuliaReinforcementLearning/ReinforcementLearningCore.jl/blob/63f306d99a6db736a1755a5d1e26f2aa8e8822dc/src/extensions/Zygote.jl#L10 Or maybe open a PR against Flux instead?
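For context, the kind of small Zygote utility that could be upstreamed looks roughly like the following; this is an illustrative sketch of global-norm gradient clipping, not necessarily the exact code at the linked line:

```julia
using Flux, Zygote, LinearAlgebra

# Scale all gradients in `gs` so their combined (global) norm
# does not exceed `clip_norm`. `ps = Flux.params(model)`,
# `gs = gradient(() -> loss(), ps)`.
function clip_by_global_norm!(gs::Zygote.Grads, ps::Zygote.Params, clip_norm)
    gnorm = sqrt(sum(norm(gs[p])^2 for p in ps if gs[p] !== nothing))
    if gnorm > clip_norm
        for p in ps
            gs[p] !== nothing && (gs[p] .*= clip_norm / gnorm)
        end
    end
    return gs
end
```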
In the current design of distributed RL, each worker creates an independent model and makes predictions separately. A better solution might be for workers on the same node to share some common...
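A hedged sketch of what node-level sharing could look like, with all names hypothetical: workers push states to a single inference task over a `Channel`, so only one copy of the model lives on each node:

```julia
using Flux

model = Chain(Dense(4, 32, relu), Dense(32, 2))  # shared policy

requests = Channel{Tuple{Vector{Float32},Channel{Vector{Float32}}}}(64)

# One inference task per node, owning the shared model.
inference = Threads.@spawn begin
    for (state, reply) in requests
        put!(reply, model(state))
    end
end

# Each worker sends its state and waits for the prediction.
function predict(state)
    reply = Channel{Vector{Float32}}(1)
    put!(requests, (state, reply))
    take!(reply)
end

predict(rand(Float32, 4))
```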
We used to have support for Knet.jl in addition to Flux.jl, but it was dropped since [email protected]. The main reason was that Knet.jl is not very easy to extend. However,...
Ref: https://arxiv.org/abs/1911.02140. Based on the existing implementation of IQN, this should be relatively easy to support.
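The key addition over IQN is the fraction proposal network: instead of sampling τ ~ U(0, 1), FQF learns the quantile fractions. A minimal sketch of that step, with hypothetical names:

```julia
using Flux

N = 8                                   # number of quantile fractions
propose = Dense(64, N)                  # logits from the state embedding

function quantile_fractions(embedding)
    p = softmax(propose(embedding))     # probabilities summing to 1
    τ = vcat(0f0, cumsum(p))            # monotone fractions, τ₀ = 0, τ_N = 1
    τ̂ = (τ[1:end-1] .+ τ[2:end]) ./ 2  # midpoints fed to the IQN embedding
    return τ, τ̂
end
```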
I've spent some time reimplementing https://github.com/liuanji/WU-UCT. It seems to work well. I'll add some experiments after https://github.com/JuliaReinforcementLearning/ReinforcementLearningZoo.jl/pull/14 gets merged.
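For reference, the core of WU-UCT is an adjusted UCT selection rule: each node additionally tracks the number of simulations that have been dispatched but not yet completed, so parallel workers do not pile onto the same branch. A rough sketch with a hypothetical `Node` struct:

```julia
struct Node
    Q::Float64   # mean return estimate
    N::Int       # completed visit count
    O::Int       # ongoing (unobserved) simulations
end

# WU-UCT score of a child; β is the exploration coefficient.
# Unvisited children (N + O == 0) should be selected first; that
# case is omitted here for brevity.
wu_uct_score(parent::Node, child::Node; β = 1.0) =
    child.Q + β * sqrt(2 * log(parent.N + parent.O) / (child.N + child.O))

# Selection picks the child maximizing this score.
select(parent::Node, children) = argmax(c -> wu_uct_score(parent, c), children)
```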
https://arxiv.org/abs/1810.09026