Avnish Narayan

Results 10 issues of Avnish Narayan

The new torch 1.5 works differently from 1.4 w.r.t. GPU usage. It will utilize the GPU, but will also try to maximize CPU usage for...

documentation

Our current behavior is to pickle policies and value/Q-functions in their exact state, whether they are on the GPU or not. This is bad because it...

bug
pytorch

Was looking into using the Ray sampler with the GPU again this week because of its potential use with the evaluation samplers/meta evaluator. So to recap, what makes it tricky...

feature

The error a contributor got when using the `categoricalgrupolicy` with `TRPO` on the `tf` branch while computing backward passes was ``` tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'optimize/hx_plain/gradients_hx_plain/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad': Connecting to invalid output 78 of source...

Currently in SAC, `train_once` returns `None` if the replay buffer doesn't have the minimum number of timesteps in it. This function should still return some value or raise an...

When running the SAC example with the plotter enabled, the plotter crashes.

bug

e.g. parameters such as steps_per_epoch, epoch_cycles, etc., and standardize across all algorithms in the codebase. A use mode that we have in garage is the ability to control how frequently...

API

Signed-off-by: avnishn. Change our grad norm clipping logic to match torch's grad norm clipping function. ## Why are these changes needed? ## Related issue number ## Checks - [ ]...
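
For context on the grad norm clipping item above, here is a minimal pure-Python sketch of the scheme that torch's `clip_grad_norm_` follows: compute one joint L2 norm over all gradients, then scale every gradient by `max_norm / (total_norm + eps)`, capped at 1. The function name `clip_grad_norm` and the list-of-lists representation are illustrative assumptions, not code from the PR.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Scale gradient vectors in place so their joint L2 norm is at most max_norm.

    `grads` is a list of lists of floats (a hypothetical stand-in for
    per-parameter gradient tensors). Returns the norm measured before clipping.
    """
    # One global norm across all parameters, not one norm per parameter.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # The coefficient is capped at 1.0 so gradients are never scaled up.
    clip_coef = min(max_norm / (total_norm + eps), 1.0)
    for vec in grads:
        for i in range(len(vec)):
            vec[i] *= clip_coef
    return total_norm


grads = [[3.0], [4.0]]
norm_before = clip_grad_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in grads for g in vec))
```

The design point is that the relative direction of the full gradient is preserved, since every entry is multiplied by the same coefficient.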

Signed-off-by: Avnish. Adding DDPG to RLlib contrib with an example. ## Why are these changes needed? ## Related issue number ## Checks - [ ] I've signed off every commit (by...

# Introduction Dowel is a tool that the garage team uses for logging results from our various reinforcement learning experiments. Dowel can be used to log different types of data...