Avnish Narayan issues

Results: 10 issues of Avnish Narayan

Update Maximizing Resource Utilization with new Torch Information

The new torch 1.5 handles GPU usage differently from 1.4. It will utilize the GPU, but will also try to maximize CPU usage for...

documentation

When pickling torch algorithms, move torch modules to CPU before pickling

Our current behavior is to pickle policies and value/Q-functions in their exact state, whether or not they are on the GPU. This is bad because it...

bug

pytorch
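
The issue text above is truncated, so the intended fix is not shown. A minimal sketch of the idea in the title, assuming a plain `torch.nn.Module` whose parameters all live on one device: move the module to the CPU, pickle it, then move it back. The helper name `dumps_on_cpu` is hypothetical and is not part of any library.

```python
import pickle

import torch
from torch import nn


def dumps_on_cpu(module: nn.Module) -> bytes:
    """Pickle a module with its tensors on the CPU (hypothetical helper).

    Assumes all parameters share one device. The module is moved back
    to that device afterwards, so the caller's copy is left unchanged.
    """
    params = list(module.parameters())
    original_device = params[0].device if params else torch.device("cpu")
    module.to("cpu")  # nn.Module.to moves parameters and buffers in place
    try:
        return pickle.dumps(module)
    finally:
        module.to(original_device)


# Usage: the restored module loads on a CPU-only machine as well.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
policy = nn.Linear(4, 2).to(device)  # stand-in for a policy network
blob = dumps_on_cpu(policy)
restored = pickle.loads(blob)
```

Pickling from the CPU means the saved snapshot does not depend on the GPU it was trained on, at the cost of one extra device transfer per save.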

Support Both GPU and CPU With The Ray Sampler

Was looking into using the Ray sampler with the GPU again this week because of its potential use with the evaluation samplers/meta evaluator. So to recap, what makes it tricky...

feature

update documentation on how to use rnns with tf/torch[pending]

The error a contributor got when using the `categoricalgrupolicy` with `TRPO` on the `tf` branch while computing backward passes was ``` tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'optimize/hx_plain/gradients_hx_plain/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad': Connecting to invalid output 78 of source...

Rework logic for filling and checking replay buffer in torch sac, ddpg, and td3

Plotter isn't working with torch policies.

Standardize off-policy RL hyperparameters across the codebase

[RLlib] Use torch's implementation of grad norm clipping

[RLlib-contrib] ddpg

Robust handling of inconsistent TabularInput keys