coax
Modular framework for Reinforcement Learning in Python
This PR adds type annotations to `coax`. Closes https://github.com/coax-dev/coax/issues/13
This issue tracks the progress of adding type annotations to `coax`. - [ ] `_core` - [ ] `experience_replay` - [ ] `model_updaters` - [ ] `policy_objectives` - [ ]...
This issue tracks the progress of converting the [numpy style](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) docstrings to the more concise [Google style](https://google.github.io/styleguide/pyguide.html#381-docstrings). - [ ] `_core` - [ ] `experience_replay` - [ ] `model_updaters` -...
**Describe the bug** Hey Kris, love your framework! I'm working with a custom environment, and your discrete-action unit test works perfectly locally. I haven't spent much time investigating this yet, just...
This PR is a rework of https://github.com/coax-dev/coax/pull/26 and adds an example for using `SAC` on the `Walker.walk` task from the DeepMind Control Suite. Depends on https://github.com/coax-dev/coax/pull/27 and https://github.com/coax-dev/coax/pull/28
This PR updates the requirements, wrappers and examples to the new API introduced by `gym==0.26.0`. Depends on https://github.com/coax-dev/coax/pull/27
This PR resolves some warnings connected to deprecations in the Jax API.
Just wondering if there are any examples of using this lib to implement RLHF (Reinforcement Learning from Human Feedback)? Inspired by: https://openai.com/blog/chatgpt  Many thanks for any help! :)
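There is no RLHF example in the source above, but the core of the technique is the reward-modelling step, which can be illustrated independently of coax. The sketch below shows the Bradley-Terry pairwise preference loss in plain NumPy; the function name and scalar-score inputs are assumptions for illustration, not coax API.

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Bradley-Terry preference loss: -log sigmoid(r_chosen - r_rejected).

    r_chosen / r_rejected are assumed to be reward-model scores for the
    human-preferred and rejected responses; the loss is averaged over
    the batch. log1p(exp(-x)) is a numerically stable -log(sigmoid(x)).
    """
    diff = np.asarray(r_chosen, dtype=float) - np.asarray(r_rejected, dtype=float)
    return float(np.mean(np.log1p(np.exp(-diff))))
```

Minimizing this loss pushes the reward model to score preferred responses above rejected ones; the learned reward is then what an RL algorithm (e.g. a policy-gradient method) would optimize.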
**Is your feature request related to a problem? Please describe.** It seems that the implemented replay buffers only operate over transitions, with no ability to operate over entire sequences. This...
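The feature request above asks for replay over entire sequences rather than single transitions. One possible design, sketched below as an assumption (this class does not exist in coax), stores whole episodes and samples fixed-length contiguous subsequences:

```python
import random
from collections import deque

class SequenceReplayBuffer:
    """Hypothetical sequence-level replay buffer.

    Stores complete episodes (lists of transitions) and samples
    contiguous windows of a fixed length, as needed for recurrent
    or sequence-model agents.
    """

    def __init__(self, capacity=100):
        # oldest episodes are evicted once capacity is reached
        self._episodes = deque(maxlen=capacity)

    def add_episode(self, transitions):
        self._episodes.append(list(transitions))

    def sample(self, seq_len):
        # restrict to episodes long enough to contain a full window
        candidates = [ep for ep in self._episodes if len(ep) >= seq_len]
        if not candidates:
            raise ValueError("no episode of length >= seq_len in buffer")
        ep = random.choice(candidates)
        start = random.randrange(len(ep) - seq_len + 1)
        return ep[start:start + seq_len]
```

Sampling windows rather than whole episodes keeps batch shapes fixed, which is what sequence models typically require.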
How would you implement a minimax Q-learner with coax? Hi there! I love the package and how accessible it is to relative newbies. The tutorials are pretty great and the...
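The question above is left open in the source, but the tabular core of minimax Q-learning (Littman, 1994) can be sketched without any coax machinery. This is an assumption-laden simplification: it backs up the pure-strategy maximin value `max_a min_o Q[s, a, o]` instead of solving the mixed-strategy linear program, and the state/action sizes are arbitrary.

```python
import numpy as np

# Tabular minimax Q-learning for a two-player zero-sum game.
# Q[s, a, o]: value of the agent playing a while the opponent plays o in state s.
n_states, n_actions, n_opponent = 5, 3, 3
Q = np.zeros((n_states, n_actions, n_opponent))
alpha, gamma = 0.1, 0.95

def minimax_q_update(s, a, o, r, s_next):
    """One TD step on the joint-action transition (s, a, o, r, s_next).

    The bootstrap target uses the maximin value of the next state:
    the agent maximizes over its actions assuming the opponent
    responds with the worst case for the agent.
    """
    v_next = np.max(np.min(Q[s_next], axis=1))
    Q[s, a, o] += alpha * (r + gamma * v_next - Q[s, a, o])

minimax_q_update(0, 1, 2, 1.0, 3)
```

A coax-based version would replace the table with a function approximator over joint actions; the maximin backup above is the part that distinguishes this from ordinary Q-learning.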