Update documentation on how to use RNNs with tf/torch [pending]

Open avnishn opened this issue 3 years ago • 3 comments

A contributor got the following error when using CategoricalGRUPolicy with TRPO on the tf branch, while computing the backward pass:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'optimize/hx_plain/gradients_hx_plain/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad': 
Connecting to invalid output 78 of source node ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad which has 78 outputs. 

Try using tf.compat.v1.experimental.output_all_intermediates(True)

avnishn avatar Dec 11 '20 15:12 avnishn

@krzentner

avnishn avatar Dec 11 '20 15:12 avnishn

I was able to fix the contributor's error by passing the following argument to TRPO's optimizer:

optimizer_args=dict(hvp_approach=FiniteDifferenceHVP(
                            base_eps=1e-5))

Is there a reason why we would need this? Is this specific to TRPO, and if so, can we modify TRPO to use this by default?
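
For context, here is a minimal sketch of what a finite-difference Hessian-vector product does in general. One possible reading, not confirmed in this thread, is that the failing node (`while_grad_grad`) comes from taking a gradient of a gradient through the GRU's while loop, and that a finite-difference HVP sidesteps this because it only evaluates first-order gradients. The code below is plain Python on a toy quadratic, not garage's implementation: `grad`, `finite_difference_hvp`, and the matrix `A` are hypothetical names for illustration, and using `base_eps` directly as the step size is an assumption (garage's `FiniteDifferenceHVP` may scale it differently).

```python
def grad(x, a):
    """Gradient of f(x) = 0.5 * x^T A x for symmetric A, which is A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def finite_difference_hvp(grad_fn, x, v, base_eps=1e-5):
    """Approximate H v using a central difference of first-order gradients.

    Only grad_fn is evaluated (twice); no second-order derivative is needed.
    """
    x_plus = [xi + base_eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - base_eps * vi for xi, vi in zip(x, v)]
    g_plus = grad_fn(x_plus)
    g_minus = grad_fn(x_minus)
    return [(gp - gm) / (2 * base_eps) for gp, gm in zip(g_plus, g_minus)]


# Toy problem: the Hessian of 0.5 * x^T A x is A itself.
A = [[2.0, 1.0], [1.0, 3.0]]
x = [0.5, -1.0]
v = [1.0, 2.0]

hv = finite_difference_hvp(lambda p: grad(p, A), x, v)
exact = grad(v, A)  # H v = A v for this quadratic
```

On this toy problem `hv` should agree with `exact` up to floating-point error. The trade-off is that the result is an approximation whose accuracy depends on the step size, which may matter when deciding whether it should be a default.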

avnishn avatar Dec 11 '20 17:12 avnishn

If the CG optimizer can't be used with RNNs (I don't think that's actually the case), we should detect that and raise an error.

ryanjulian avatar Dec 11 '20 18:12 ryanjulian