modular_rl issues

Line search does not check KL constraint satisfaction

2

The TRPO paper (Appendix C) claims that "we use a line search to ensure improvement of the surrogate objective and satisfaction of the KL divergence constraint". However, in the current...

zhihanyang2022
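
The check this issue asks for can be sketched as a backtracking line search that accepts a step only when the surrogate improves and the KL divergence stays within the bound. This is a minimal sketch, not the repository's code: `backtracking_line_search`, `toy_surrogate`, `toy_kl`, `max_kl`, and `shrink` are hypothetical names, and the quadratic toy functions stand in for the real surrogate and KL estimates.

```python
import numpy as np


def backtracking_line_search(surrogate, kl, theta, full_step, max_kl,
                             max_backtracks=10, shrink=0.5):
    """Try theta + shrink**k * full_step for k = 0, 1, ... and return the
    first candidate that BOTH improves the surrogate and satisfies
    kl(theta, candidate) <= max_kl. Returns (accepted, parameters)."""
    base = surrogate(theta)
    for k in range(max_backtracks):
        candidate = theta + (shrink ** k) * full_step
        improved = surrogate(candidate) > base
        within_trust_region = kl(theta, candidate) <= max_kl
        if improved and within_trust_region:
            return True, candidate
    # No acceptable step found: keep the old parameters.
    return False, theta


# Toy stand-ins (assumptions for illustration only).
def toy_surrogate(theta):
    # Maximized at theta = 1.
    return -float(np.sum((theta - 1.0) ** 2))


def toy_kl(old, new):
    # Quadratic proxy for a KL divergence between nearby policies.
    return 0.5 * float(np.sum((new - old) ** 2))


theta0 = np.zeros(2)
accepted, theta1 = backtracking_line_search(
    toy_surrogate, toy_kl, theta0, full_step=np.array([2.0, 2.0]), max_kl=0.3)
```

In this toy run the half step improves the surrogate but has a KL proxy of 1.0, so it is rejected; only the quarter step passes both tests. A search that checked the surrogate alone would have accepted the half step.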

Problem in net.add(ConcatFixedStd())

2

Hi there, I am trying to use your code to train on a new pybullet environment. Here is my pip log: Keras (2.0.2) Markdown (2.6.11) mock (2.0.0) numpy (1.13.3) pbr (3.1.1)...

larakim

python 3 support?

Currently, using Python 3 gives:

```
  File "run_pg.py", line 44
    print "*********** Iteration %i ****************" % COUNTER
                                                              ^
SyntaxError: Missing parentheses in call to 'print'
```

hughperkins
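
For the specific error quoted in this issue, a minimal sketch of a portable form of that statement follows. The value assigned to `COUNTER` is a placeholder for illustration, and the rest of `run_pg.py` may need further changes that are not shown here.

```python
COUNTER = 3  # placeholder value for illustration

# Build the message first, then print it with a single parenthesized
# argument. This form runs under Python 3 and also under Python 2, where
# the parentheses are treated as grouping around one expression.
message = "*********** Iteration %i ****************" % COUNTER
print(message)
```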

Will dropout break the final loss of the PPO algorithm?

1

If I add a dropout layer to the model, would that be a bad idea? Have there been any experiments on this?

ppaanngggg

Wlast.set_value(Wlast.get_value(borrow=True)*0.1)

4

Hi John, I have read your TRPO paper and I'm trying to reproduce the Fisher-vector product calculation function in C. Lines 36-37 in agentzoo.py confuse me. I copy the...

sjshao09
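
As a rough illustration of the statement in this issue's title, the sketch below uses a plain NumPy array in place of a Theano shared variable. It assumes a linear output layer without bias; `W_last`, `hidden`, and the shapes are made up for the example and are not taken from agentzoo.py.

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical final-layer weights and a batch of hidden activations.
W_last = rng.randn(4, 2)
hidden = rng.randn(5, 4)

out_before = hidden.dot(W_last)

# NumPy analogue of Wlast.set_value(Wlast.get_value(borrow=True) * 0.1):
# every weight of the last layer is multiplied by 0.1.
W_last = W_last * 0.1

out_after = hidden.dot(W_last)
```

Because this layer is linear, its outputs shrink by the same factor of 0.1, so the network starts out producing values closer to zero.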

Why use timesteps in the evaluation of the value function?

Hello John, after reading your paper on TRPO and viewing your code on GitHub, I am a little confused about the steps regarding the prediction of value functions. Here, you...

afansi

'Dense' object has no attribute 'W'

5

Hi there, I'm trying to reproduce the results. But when running the code, I first ran into the Monitor error, which was caused by updates to the gym environments. And...

jiadingfang
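
One possible cause of the error in this issue's title, assuming the installed Keras is 2.x, is that Keras 2 exposes a `Dense` layer's weight matrix as `kernel` rather than `W`. The sketch below shows a version-tolerant lookup. `get_weight_variable` and the two stand-in classes are hypothetical and exist only so the example runs without Keras installed.

```python
def get_weight_variable(layer):
    """Return the layer's weight variable under either attribute name.

    Tries `kernel` (Keras 2 naming) first, then `W` (Keras 1 naming)."""
    for name in ("kernel", "W"):
        if hasattr(layer, name):
            return getattr(layer, name)
    raise AttributeError("layer has neither 'kernel' nor 'W'")


# Stand-ins for real layers (assumptions for illustration only).
class FakeKeras1Dense(object):
    def __init__(self):
        self.W = "weights-from-W"


class FakeKeras2Dense(object):
    def __init__(self):
        self.kernel = "weights-from-kernel"


old_style = get_weight_variable(FakeKeras1Dense())
new_style = get_weight_variable(FakeKeras2Dense())
```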

Atari training and LBFGS gpu memory overhead

Error when using saved weights to continue learning

Running TRPO with RNN
