# sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric

Results: 27 sheeprl issues

Hi everyone, in [this branch](https://github.com/Eclectic-Sheep/sheeprl/tree/feature/compile) one can use `torch.compile` to compile the Dreamer-V3 agent. In particular:
* in `sheeprl/configs/algo/dreamer_v3.yaml` one can decide what to compile and which arguments to...

enhancement
help wanted
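
The snippet above refers to the branch's config-driven compilation; as a minimal sketch of the underlying mechanism, here is how a module can be wrapped with `torch.compile` (the `nn.Sequential` stand-in and the `backend="eager"` choice are illustrative assumptions, not the branch's actual setup):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for one of the Dreamer-V3 agent's networks.
model = nn.Sequential(nn.Linear(64, 256), nn.SiLU(), nn.Linear(256, 64))

# Compile the module. Arguments such as `backend`, `mode`, or `fullgraph`
# are the kind of options a config file could expose per component;
# "eager" is a lightweight debugging backend used here to keep the sketch cheap.
compiled_model = torch.compile(model, backend="eager")

out = compiled_model(torch.randn(8, 64))
print(out.shape)  # torch.Size([8, 64])
```

The compiled module is a drop-in replacement for the original, so a config flag can simply decide which sub-modules get wrapped.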

As we do in the [coupled version of SAC](https://github.com/Eclectic-Sheep/sheeprl/blob/76b21b7f52fe16bbbcc845b7ed0371d3f2c39d30/sheeprl/algos/sac/sac.py#L295), where we gather data from all ranks before running a distributed update, we should add the possibility for all of Dreamer's implementations...

enhancement
help wanted
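
The gather-before-update pattern mentioned above can be sketched with plain `torch.distributed` (a single-process `gloo` group here, purely for illustration; sheeprl itself goes through Lightning Fabric):

```python
import os
import torch
import torch.distributed as dist

# Minimal single-process setup so the collective call can run; in a real
# multi-rank launch these would be provided by the launcher.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

local_batch = torch.randn(4, 8)  # each rank's own sampled batch

# Gather every rank's batch, then concatenate so all ranks update on the
# same full batch of data.
gathered = [torch.empty_like(local_batch) for _ in range(dist.get_world_size())]
dist.all_gather(gathered, local_batch)
full_batch = torch.cat(gathered, dim=0)

print(full_batch.shape)  # torch.Size([4, 8]) with world_size=1
dist.destroy_process_group()
```

With `world_size > 1` the concatenated batch grows accordingly, which is the behavior the issue asks to port to the Dreamer implementations.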

I am thinking of using sheeprl as the base for my RL experiments! My work usually builds off of DQN-type algorithms: in increasing order of complexity, DDQN, Rainbow,...

help wanted
algorithm

Starting from the available pseudocode, do the following:
- [x] Verify available infrastructure (buffer, models)
- [x] Remove multiplayer support
- [ ] Create config file
- [ ] Remove...

algorithm

Write a how-to file in Markdown specifying how to contribute to the repo

documentation

The `per_layer_ortho_init_weights` function is implemented but never used. We should add an option to activate it when a neural network is created.

enhancement
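
As a hedged sketch of what such a per-layer orthogonal initializer typically does (this is an illustrative reimplementation, not sheeprl's actual `per_layer_ortho_init_weights`):

```python
import torch
import torch.nn as nn

def per_layer_ortho_init_weights(module: nn.Module, gain: float = 1.0) -> None:
    """Illustrative sketch: apply orthogonal init to every Linear/Conv layer."""
    for m in module.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            nn.init.orthogonal_(m.weight, gain=gain)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
per_layer_ortho_init_weights(net)

# For a 16x8 weight matrix, orthogonal init makes the columns orthonormal,
# so W^T @ W is the 8x8 identity.
w = net[0].weight
print(torch.allclose(w.T @ w, torch.eye(8), atol=1e-5))  # True
```

Exposing this behind a config flag (e.g. at model-creation time) would be one way to make the existing function reachable.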

Hi, when I run the code `python sheeprl.py exp=dreamer_v3 env=mujoco env.id=Walker2d-v4 algo.cnn_keys.encoder=[rgb]`, the following is the output:

```
Rank-0: policy_step=788, reward_env_2=0.31653469800949097
Rank-0: policy_step=828, reward_env_2=-1.8613801002502441
Rank-0: policy_step=828, reward_env_3=-0.6631340384483337
Rank-0: policy_step=840, reward_env_0=-3.4890027046203613
Rank-0: ...
```

I tried `sheeprl exp=dreamer_v3 env=gym env.id=CartPole-v1` and I got "ValueError: you tried to log -1 which is currently not supported. Try a dict or a scalar/tensor. Set the...

## Summary

This PR adds various env wrappers:
* [ClipAction](https://gymnasium.farama.org/api/wrappers/action_wrappers/#gymnasium.wrappers.ClipAction)
* NormalizeObservationWrapper: inspired by [NormalizeObservation](https://gymnasium.farama.org/api/wrappers/observation_wrappers/#gymnasium.wrappers.NormalizeObservation), applies normalization to `Dict` spaces
* [NormalizeReward](https://gymnasium.farama.org/api/wrappers/reward_wrappers/#gymnasium.wrappers.NormalizeReward)
* Observation clip: clips the observations in...

I cannot _sheeprl-eval_ my trained model, since the keys in the world model's `state_dict` have different names.

Stacktrace:

```
Error executing job with overrides: ['checkpoint_path=/home/drt/Desktop/sheeprl/sheeprl/logs/runs/dreamer_v3/PyFlyt/2024-06-23_19-34-31_dreamer_v3_PyFlyt_42/version_0/checkpoint/ckpt_730000_0.ckpt', 'fabric.accelerator=gpu', 'env.capture_video=True', 'seed=52']
Traceback (most recent...
```