Stoix issues

Results 15 Stoix issues

Add stochastic muzero

## What? Added minimal support for Stochastic MuZero, per issue #77. ## Why? To be able to train on stochastic environments like 2048, poker, ... ## How? Added Afterstate and Encoder...

ipsec

[FEATURE] Add stochastic muzero implementation

Add a Stochastic MuZero implementation; see the [paper](https://openreview.net/pdf?id=X6D9bAHhBQ1) and the [pseudocode](https://gist.github.com/Mononofu/7548d8aa4bf94e12bc7eb7662fd60b56). With this improved version of MuZero, Stoix would be able to train on stochastic environments like the 2048 game and poker...

ipsec

enhancement

Roadmap

feat: add discrete sac and rename continuous sac

## What? Add discrete SAC

EdanToledo

[FEATURE] Evaluate at the beginning of training before any learning has occurred

EdanToledo

enhancement

good first issue

[FEATURE] Generalise win_rate to solve_rate allowing environment config to specify desired reward for "solved"

EdanToledo

enhancement
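One way to read this request is that a win rate is a solve rate with a fixed threshold. The sketch below illustrates that reading in plain Python. The names `solve_rate` and `solved_return_threshold` are hypothetical, not Stoix functions or config keys, and treating an episode as solved when its return is at least the threshold is an assumption.

```python
from typing import Sequence


def solve_rate(
    episode_returns: Sequence[float],
    solved_return_threshold: float,
) -> float:
    """Fraction of episodes whose return reaches the configured threshold.

    Hypothetical helper: the threshold would come from the environment
    config, so each environment can define what "solved" means.
    """
    if not episode_returns:
        # No completed episodes yet, so nothing has been solved.
        return 0.0
    solved = sum(1 for ret in episode_returns if ret >= solved_return_threshold)
    return solved / len(episode_returns)
```

With a threshold equal to the winning reward, this reduces to a plain win rate for environments that only emit a terminal win signal.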

[FEATURE] Support Multi-host systems, not only multi-device

### Feature Implement correct setup for multi-host systems, in addition to the current multi-device support. ### Proposal This involves using local devices and seeding each process appropriately using its process ID.

EdanToledo

enhancement

good first issue
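The proposal in the multi-host issue above (local devices plus process-dependent seeds) might look roughly like the following JAX sketch. It is an illustration under assumptions, not Stoix's implementation: `make_local_keys` is a hypothetical helper, and a real multi-host launch would also need `jax.distributed.initialize()` to be called before any of this runs.

```python
import jax


def make_local_keys(seed: int) -> jax.Array:
    """Return one PRNG key per local device, distinct across processes.

    Hypothetical helper. Assumes each host runs one JAX process and that
    the distributed runtime has already been initialised where needed.
    """
    base_key = jax.random.PRNGKey(seed)
    # Mix the process index into the key so every host gets a different
    # stream even though all hosts start from the same seed.
    process_key = jax.random.fold_in(base_key, jax.process_index())
    # Split only over the devices attached to this host, not the global
    # device count.
    return jax.random.split(process_key, jax.local_device_count())


keys = make_local_keys(42)
```

On a single host `jax.process_index()` is 0, so the same code path covers the existing multi-device case.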

[FEATURE] Add support for efficient recurrent models

### Feature [Revisiting Recurrent Reinforcement Learning with Memory Monoids](https://arxiv.org/abs/2402.09900) provides a method to combine recurrent models with standard, nonrecurrent RL losses. This should provide support for S5, LRU, FFM, Linear...

smorad

enhancement

Roadmap
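The efficiency of such models comes from writing the recurrence as an associative operator, which can then be evaluated with a parallel scan instead of a sequential loop. The sketch below shows this for a generic elementwise linear recurrence using `jax.lax.associative_scan`. It is not the paper's or Stoix's implementation; the recurrence `h_t = a_t * h_{t-1} + b_t` with `h_0 = 0` and the function names are assumptions chosen for illustration.

```python
import jax
import jax.numpy as jnp


def combine(left, right):
    """Compose two affine updates h -> a * h + b (left applied first)."""
    a_left, b_left = left
    a_right, b_right = right
    return a_left * a_right, a_right * b_left + b_right


def linear_recurrence(a: jax.Array, b: jax.Array) -> jax.Array:
    """Compute h_t = a_t * h_{t-1} + b_t for all t, assuming h_0 = 0."""
    # The scan returns the running composition of updates; its second
    # component is the hidden state at each time step.
    _, hidden = jax.lax.associative_scan(combine, (a, b))
    return hidden


hidden = linear_recurrence(jnp.array([0.5, 0.5, 0.5]), jnp.ones(3))
```

Because `combine` is associative, the scan can process a whole trajectory in logarithmic depth, which is what allows recurrent models of this form to be trained on long sequences.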

[INVESTIGATION] Using AffineTanhNormal Head instead of multivariate head improves MPO performance but requires slight changes to loss function

Just leaving this here as a reminder to change it properly when I have time.

EdanToledo

[FEATURE] Include command-line interface

### Problem To run a model, we need to specify the exact `.py` file of the system, i.e. `python stoix/systems/ff_ppo.py`. ### Solution It would be much easier if...

gregfurman

enhancement
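The issue preview is cut off before the proposed solution, so the following is only one possible shape for such an interface, not the design the issue describes. It is a minimal `argparse` sketch that maps a short system name to its script path. The `stoix-run` program name, the `SYSTEMS` registry, and `resolve_command` are all hypothetical; only the `stoix/systems/ff_ppo.py` path comes from the issue text.

```python
import argparse
from typing import List, Optional

# Hypothetical registry from short system names to script paths.
SYSTEMS = {
    "ff_ppo": "stoix/systems/ff_ppo.py",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="stoix-run")
    parser.add_argument("system", choices=sorted(SYSTEMS), help="system to run")
    # Anything after the system name is passed through to the script.
    parser.add_argument("overrides", nargs="*", help="extra arguments for the script")
    return parser


def resolve_command(argv: Optional[List[str]] = None) -> List[str]:
    """Translate CLI arguments into the command that would be executed."""
    args = build_parser().parse_args(argv)
    return ["python", SYSTEMS[args.system], *args.overrides]
```

Here `resolve_command(["ff_ppo"])` yields the same invocation the issue quotes, so the short name stands in for the full script path.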

[BUG] Jax - Flax compatibility error

### Describe the bug Hello! When making the Dockerfile, I get the error `Cannot import name 'linear_util' from 'jax'` when running examples. This seems to be due to the incompatibility...

thomashirtz

bug
