garage

A toolkit for reproducible reinforcement learning research.

Results 108 garage issues

Hi, your code is nice work, but I am confused about some details of the MAML PyTorch implementation. In the inner loop you update the params of the tasks and save them in `all...

Test whether we can greatly simplify the CI by using the GitHub Actions container option.

I was looking into using the Ray sampler with the GPU again this week because of its potential use with the evaluation samplers/meta evaluator. So, to recap, what makes it tricky...

feature

See https://github.com/pytorch/pytorch/issues/975 for more info. PyTorch TRPO appears 50% slower than TF. Not sure about PPO, but I expect the wall-clock time gap will be the same. To fix this...

pytorch

Thank you for the clean and well-documented library! I am trying to use MAML for 2D navigation but have been achieving suboptimal policies. In particular, rollouts from the (adapted) trained...

In the source code of [cem.py](https://github.com/rlworkgroup/garage/blob/master/src/garage/np/algos/cem.py), there is an inconsistency between the [definition](https://github.com/rlworkgroup/garage/blob/master/src/garage/np/algos/cem.py#L68) and the [call](https://github.com/rlworkgroup/garage/blob/master/src/garage/np/algos/cem.py#L166) of the function `_sample_params`. The details are presented below: In line 166, [the call...

Extend the `Environment` API to support setting environment library specific seeds.

Tasks:
- [x] Extend `Environment` interface
- [x] Set seeds for Gym envs
- [x] Ensure seeds are set...
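To make the seeding idea above concrete, here is a minimal sketch of what a seedable environment interface could look like. Everything in it is an assumption for illustration: `SeedableEnvironment`, `ToyGymLikeEnv`, and the `seed()` method are hypothetical names, not garage's actual `Environment` API, and the toy environment uses Python's `random` module in place of a real Gym env.

```python
import random
from abc import ABC, abstractmethod


class SeedableEnvironment(ABC):
    """Hypothetical interface; not garage's actual `Environment` API."""

    @abstractmethod
    def seed(self, seed):
        """Seed the underlying environment library's random state."""

    @abstractmethod
    def reset(self):
        """Start a new episode and return the first observation."""


class ToyGymLikeEnv(SeedableEnvironment):
    """Toy stand-in for a wrapped Gym env (assumption, stdlib only)."""

    def __init__(self):
        # Each environment owns its own generator, so seeding one
        # environment does not disturb any other.
        self._rng = random.Random()

    def seed(self, seed):
        self._rng.seed(seed)

    def reset(self):
        # The initial observation is drawn from the seeded generator.
        return self._rng.random()


def first_observation(seed):
    """Create an environment, seed it, and return its first observation."""
    env = ToyGymLikeEnv()
    env.seed(seed)
    return env.reset()
```

With this shape, two environments seeded with the same value produce the same first observation, which is the property a reproducibility-focused toolkit would presumably want to check in its tests.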

The error a contributor got when computing backward passes using the `categoricalgrupolicy` with `TRPO` on the `tf` branch was ``` tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'optimize/hx_plain/gradients_hx_plain/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad/ConjugateGradientOptimizer/update_opt_mean_kl/gradients_constraint/policy_1/gru/rnn_2/while_grad/policy_1/gru/rnn_2/while_grad_grad': Connecting to invalid output 78 of source...

Currently in SAC, `train_once` returns `None` if the replay buffer doesn't have the minimum number of timesteps in it. This function should still return some value or raise an...
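The two options mentioned above (return a value, or raise) can be sketched as follows. This is an illustration under stated assumptions, not garage's SAC: `ToyOffPolicyAlgo`, `NotEnoughSamplesError`, `min_buffer_size`, and the `strict` flag are hypothetical names, and the "loss" is just the mean of sampled rewards.

```python
import math
import random


class NotEnoughSamplesError(RuntimeError):
    """Hypothetical error: the replay buffer is too small to train on."""


class ToyOffPolicyAlgo:
    """Hypothetical stand-in for an off-policy algorithm; not garage's SAC."""

    def __init__(self, min_buffer_size, batch_size=2, strict=False):
        self.min_buffer_size = min_buffer_size
        self.batch_size = batch_size
        self.strict = strict
        self.replay_buffer = []

    def add_transition(self, reward):
        # A real buffer stores full transitions; a reward is enough here.
        self.replay_buffer.append(reward)

    def train_once(self):
        """Return a float on every call, or raise when `strict` is set."""
        if len(self.replay_buffer) < self.min_buffer_size:
            if self.strict:
                raise NotEnoughSamplesError(
                    f'need {self.min_buffer_size} timesteps, '
                    f'have {len(self.replay_buffer)}')
            # Sentinel value: callers always receive a float, never None.
            return math.nan
        size = min(self.batch_size, len(self.replay_buffer))
        batch = random.sample(self.replay_buffer, size)
        # Placeholder "loss": the mean reward of the sampled batch.
        return sum(batch) / len(batch)
```

Returning `math.nan` keeps the return type uniform for logging code, while the `strict` path makes an under-filled buffer fail loudly. Which behavior suits the library is a design decision this sketch does not settle.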