sheeprl
Distributed Reinforcement Learning accelerated by Lightning Fabric
Hi everyone, in [this branch](https://github.com/Eclectic-Sheep/sheeprl/tree/feature/compile) one can use `torch.compile` to compile the Dreamer-V3 agent. In particular: * in `sheeprl/configs/algo/dreamer_v3.yaml` one can decide what to compile and which arguments to...
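Below is a minimal sketch of how compiling selected Dreamer-V3 modules from a config might look; the `compile_cfg` keys (`enabled`, `mode`, `fullgraph`, `dynamic`) are illustrative assumptions, not the exact schema used in `dreamer_v3.yaml`.

```python
# Minimal sketch of compiling selected modules from a config.
# The config keys below are illustrative assumptions, not the branch's schema.
import torch
import torch.nn as nn


def maybe_compile(module: nn.Module, cfg: dict) -> nn.Module:
    """Wrap `module` with torch.compile when enabled in the config."""
    if not cfg.get("enabled", False):
        return module
    return torch.compile(
        module,
        mode=cfg.get("mode", "default"),       # e.g. "reduce-overhead", "max-autotune"
        fullgraph=cfg.get("fullgraph", False),
        dynamic=cfg.get("dynamic", None),
    )


# Example usage: compile only the world model, keep the actor eager.
compile_cfg = {
    "world_model": {"enabled": True, "mode": "reduce-overhead"},
    "actor": {"enabled": False},
}
world_model = maybe_compile(nn.Linear(512, 512), compile_cfg["world_model"])
actor = maybe_compile(nn.Linear(512, 6), compile_cfg["actor"])
```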
As we do in the [coupled version of SAC](https://github.com/Eclectic-Sheep/sheeprl/blob/76b21b7f52fe16bbbcc845b7ed0371d3f2c39d30/sheeprl/algos/sac/sac.py#L295), where we gather data from all ranks before running a distributed update, we should add the possibility for all the Dreamer implementations...
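A rough sketch of what gathering every rank's batch before the update could look like with Lightning Fabric; the buffer fields and shapes are placeholders, not sheeprl's actual buffer API.

```python
# Hedged sketch: gather each rank's locally sampled batch on all processes
# before running the update, mirroring the coupled SAC approach.
import torch
from lightning.fabric import Fabric


def train_step(fabric: Fabric) -> None:
    # Pretend each rank sampled its own batch from its replay buffer
    # (field names and shapes are placeholders).
    local_batch = {
        "observations": torch.randn(32, 4),
        "actions": torch.randn(32, 1),
    }
    gathered = fabric.all_gather(local_batch)
    # In a multi-process run `all_gather` prepends a world-size dimension;
    # flattening it gives every rank the union of all ranks' samples.
    if fabric.world_size > 1:
        gathered = {k: v.flatten(0, 1) for k, v in gathered.items()}
    # ... the update would then consume `gathered` on every rank.


if __name__ == "__main__":
    fabric = Fabric(accelerator="cpu", devices=1)
    fabric.launch()
    train_step(fabric)
```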
I am thinking of using sheeprl as the base for my RL experiments! My work usually builds on DQN-type algorithms: in increasing order of complexity, DDQN, Rainbow,...
Starting from the available pseudocode, do the following: - [x] Verify available infrastructure (buffer, models) - [x] Remove multiplayer support - [ ] Create config file - [ ] Remove...
Write a how-to file in markdown specifying how to contribute to the repo
The `per_layer_ortho_init_weights` function is implemented but never used. We should add an option to activate it when a neural network is created.
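A sketch of how an opt-in flag could wire per-layer orthogonal initialization into network construction; the helper below is a stand-in and does not necessarily match the signature of sheeprl's `per_layer_ortho_init_weights`.

```python
# Hedged sketch: an opt-in flag that applies per-layer orthogonal
# initialization when an MLP is created. The init helper is a stand-in.
import torch.nn as nn


def per_layer_ortho_init(module: nn.Module, gain: float = 1.0) -> None:
    """Orthogonally initialize every Linear layer's weight and zero its bias."""
    for layer in module.modules():
        if isinstance(layer, nn.Linear):
            nn.init.orthogonal_(layer.weight, gain=gain)
            if layer.bias is not None:
                nn.init.zeros_(layer.bias)


def build_mlp(in_dim: int, hidden: int, out_dim: int, ortho_init: bool = False) -> nn.Module:
    net = nn.Sequential(
        nn.Linear(in_dim, hidden),
        nn.SiLU(),
        nn.Linear(hidden, out_dim),
    )
    if ortho_init:  # the opt-in flag this issue asks for
        per_layer_ortho_init(net)
    return net
```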
Hi, when I run `python sheeprl.py exp=dreamer_v3 env=mujoco env.id=Walker2d-v4 algo.cnn_keys.encoder=[rgb]`, the following is the output: Rank-0: policy_step=788, reward_env_2=0.31653469800949097 Rank-0: policy_step=828, reward_env_2=-1.8613801002502441 Rank-0: policy_step=828, reward_env_3=-0.6631340384483337 Rank-0: policy_step=840, reward_env_0=-3.4890027046203613 Rank-0:...
I tried; "sheeprl exp=dreamer_v3 env=gym env.id=CartPole-v1" this one and i got "ValueError: you tried to log -1 which is currently not supported. Try a dict or a scalar/tensor. Set the...
## Summary This PR adds various env wrappers: * [ClipAction](https://gymnasium.farama.org/api/wrappers/action_wrappers/#gymnasium.wrappers.ClipAction) * NormalizeObservationWrapper: inspired by the [NormalizeObservation](https://gymnasium.farama.org/api/wrappers/observation_wrappers/#gymnasium.wrappers.NormalizeObservation), applies normalization to `Dict` spaces * [NormalizeReward](https://gymnasium.farama.org/api/wrappers/reward_wrappers/#gymnasium.wrappers.NormalizeReward) * Observation clip: clip the observations in...
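For context, a minimal sketch of a running-mean/std normalizer over `Dict` observation spaces, in the spirit of the wrapper listed above; class and attribute names are illustrative, not the PR's actual implementation.

```python
# Hedged sketch of a Dict-space observation normalizer inspired by
# gymnasium's NormalizeObservation; names and details are illustrative.
import gymnasium as gym
import numpy as np


class _RunningMeanStd:
    """Tracks a running mean and variance with batched Welford-style updates."""

    def __init__(self, shape):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 1e-4

    def update(self, x):
        batch_mean, batch_var, batch_count = x.mean(0), x.var(0), x.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        self.mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        self.var = (m_a + m_b + delta**2 * self.count * batch_count / total) / total
        self.count = total


class DictNormalizeObservation(gym.ObservationWrapper):
    """Applies per-key running normalization to Dict observation spaces."""

    def __init__(self, env, epsilon: float = 1e-8):
        super().__init__(env)
        self.epsilon = epsilon
        self.rms = {
            k: _RunningMeanStd(space.shape)
            for k, space in env.observation_space.spaces.items()
        }

    def observation(self, obs):
        out = {}
        for k, v in obs.items():
            self.rms[k].update(np.asarray(v, dtype=np.float64)[None, ...])
            out[k] = (v - self.rms[k].mean) / np.sqrt(self.rms[k].var + self.epsilon)
        return out
```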
I cannot _sheeprl-eval_ my trained model, since the keys in the world model's state_dict have different names: Stacktrace: Error executing job with overrides: ['checkpoint_path=/home/drt/Desktop/sheeprl/sheeprl/logs/runs/dreamer_v3/PyFlyt/2024-06-23_19-34-31_dreamer_v3_PyFlyt_42/version_0/checkpoint/ckpt_730000_0.ckpt', 'fabric.accelerator=gpu', 'env.capture_video=True', 'seed=52'] Traceback (most recent...
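If the mismatch turns out to be only a key prefix (for example the `_orig_mod.` prefix that `torch.compile` adds to compiled modules), a possible workaround is to strip it before loading; this is a guess at the cause, not a confirmed fix, and the checkpoint layout below is an assumption.

```python
# Hedged workaround sketch: strip a possible "_orig_mod." prefix from the
# world model's state_dict keys before loading. The checkpoint path and the
# "world_model" key are assumptions; inspect ckpt.keys() first.
import torch

ckpt = torch.load("path/to/ckpt.ckpt", map_location="cpu")  # placeholder path
state = ckpt["world_model"]  # assumed checkpoint key

cleaned = {k.removeprefix("_orig_mod."): v for k, v in state.items()}
# world_model.load_state_dict(cleaned)
```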