HandyRL
HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.
So far, the adoption rate in the replay buffer has been linear in `maximum_episodes`, but this means that the earliest episodes will be selected many times before the buffer fills up.
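As a rough illustration of the behavior described above (this is only a sketch, not HandyRL's actual implementation, and the function and argument names are made up):

```python
import random

# Sketch of a linear acceptance rule normalized by the full capacity
# `maximum_episodes` rather than by the number of episodes stored so far.
def select_episode(episodes, maximum_episodes):
    while True:
        idx = random.randrange(len(episodes))
        # The acceptance probability grows linearly with recency, but while
        # the buffer is still filling up it is close to 1 for every stored
        # episode, so the earliest episodes keep being drawn many times.
        accept_rate = 1 - (len(episodes) - 1 - idx) / maximum_episodes
        if random.random() < accept_rate:
            return episodes[idx]
```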
Computing a Nash equilibrium looks wonderful, but some updates would be required.
This is just one possible idea, especially for large-scale training.
There is no clear answer to how `rho` and `c` should be defined. However, in a game like rock-paper-scissors, where the best move depends on the opponent's move, it makes no sense...
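If `rho` and `c` here are the V-trace clipping thresholds from IMPALA (Espeholt et al., 2018), a minimal sketch of where they enter the target computation looks like this (the function and tensor names are made up for the example):

```python
import torch

def vtrace_targets(rewards, values, bootstrap_value, log_rhos,
                   gamma=1.0, rho_bar=1.0, c_bar=1.0):
    # rewards, values, log_rhos: tensors of shape [T];
    # log_rhos are log(pi(a|s) / mu(a|s)) importance ratios.
    rhos = torch.exp(log_rhos)
    clipped_rhos = torch.clamp(rhos, max=rho_bar)  # rho_t = min(rho_bar, ratio)
    clipped_cs = torch.clamp(rhos, max=c_bar)      # c_t   = min(c_bar, ratio)

    next_values = torch.cat([values[1:], bootstrap_value.view(1)])
    deltas = clipped_rhos * (rewards + gamma * next_values - values)

    # Backward recursion:
    # vs_t - V(x_t) = delta_t + gamma * c_t * (vs_{t+1} - V(x_{t+1}))
    vs_minus_v = torch.zeros_like(values)
    acc = torch.zeros(())
    for t in reversed(range(len(rewards))):
        acc = deltas[t] + gamma * clipped_cs[t] * acc
        vs_minus_v[t] = acc
    return values + vs_minus_v
```

Roughly, `rho_bar` determines which value function the targets converge to (closer to the target policy's for larger values), while `c_bar` controls how far corrections propagate backward in time.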
Scaling the learning rate in proportion to the batch size looks strange.
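For reference, the pattern presumably being questioned is the linear scaling rule, where the learning rate is multiplied by the ratio of the batch size to some reference batch size (the numbers and names below are made up):

```python
# Linear scaling rule: lr grows in proportion to the batch size.
base_lr = 3e-4
base_batch_size = 256

def scaled_lr(batch_size):
    return base_lr * batch_size / base_batch_size
```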
For example:
```python
if args.get('show', False):
    self.env.render()
```
- board view
- better neural net?
- preparation for piece color estimation
This is a more generalized version.
In the future, I'd like to remove `prepare_env` and handle entries from each `Gather`, but first I have an idea to remove a useless port opened by the `entry_server`.