Jun Tian issues

Results: 70 issues of Jun Tian

Questions on Contributing Python Demos

Last year I studied the book [Machine Learning: a Probabilistic Perspective](http://www.cs.ubc.ca/~murphyk/MLbook/). However, with only a little knowledge of Matlab, I found it hard to understand the source code. Then...

Add link to the Julia binding

This package is awesome! The APIs are really concise, which makes it very easy to provide a Julia binding with the C API.

Disable formatting multi-line comment inline

```
julia> format_text("f(a;#= b=3 =#, c=4)")
"f(a, c = 4)#= b=3 =#"
```

Could we disable formatting here?

bug

low priority

Next Release Plan (v0.11)

# Goal

Improve the interactions between ReinforcementLearning.jl and other ecosystems in Julia.

## Why is it important?

In the early days of developing this package, the main goal was to...

Improve the logging mechanism during training

Currently, in each `policy` or `learner`, we allocate temporary memory to record intermediate data, and recording that data requires adding an extra hook. There are at least...

Model based reinforcement learning

After taking a look at [facebookresearch/mbrl-lib](https://github.com/facebookresearch/mbrl-lib), I think our current design is flexible enough to implement most of the algorithms in it. We just need to standardize some interfaces....

design

Explain current implementation of PPO in detail

Ref: https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/

doc

Combine transformers and RL

This seems like an interesting intersection between NLP and RL. I'll give it a try when I have time.

- [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345)
- [Reinforcement Learning...

enhancement

RLZoo

Rename some functions to help beginners navigate source code

I only realized this problem very recently. Multiple dispatch seems to be overused in this package. Take the `update!` function, for example. I thought it was quite straightforward. When we...

enhancement

Support multiple discrete action space

A2C and PPO can be improved further to support multiple discrete action spaces.

enhancement

good first issue