sheeprl
Distributed Reinforcement Learning accelerated by Lightning Fabric
Hi everyone, in [this branch](https://github.com/Eclectic-Sheep/sheeprl/tree/feature/compile) one can use `torch.compile` to compile the Dreamer-V3 agent. In particular: * in `sheeprl/configs/algo/dreamer_v3.yaml` one can decide what to compile and which arguments to...
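Below is a minimal sketch of how compiling selected Dreamer-V3 modules from a config might look; the `compile_cfg` keys (`enabled`, `mode`, `fullgraph`, `dynamic`) are illustrative assumptions, not the exact schema used in `dreamer_v3.yaml`.

```python
# Minimal sketch of compiling selected modules from a config.
# The config keys below are illustrative assumptions, not the branch's schema.
import torch
import torch.nn as nn


def maybe_compile(module: nn.Module, cfg: dict) -> nn.Module:
    """Wrap `module` with torch.compile when enabled in the config."""
    if not cfg.get("enabled", False):
        return module
    return torch.compile(
        module,
        mode=cfg.get("mode", "default"),       # e.g. "reduce-overhead", "max-autotune"
        fullgraph=cfg.get("fullgraph", False),
        dynamic=cfg.get("dynamic", None),
    )


# Example usage: compile only the world model, keep the actor eager.
compile_cfg = {
    "world_model": {"enabled": True, "mode": "reduce-overhead"},
    "actor": {"enabled": False},
}
world_model = maybe_compile(nn.Linear(512, 512), compile_cfg["world_model"])
actor = maybe_compile(nn.Linear(512, 6), compile_cfg["actor"])
```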
As we do in the [coupled version of SAC](https://github.com/Eclectic-Sheep/sheeprl/blob/76b21b7f52fe16bbbcc845b7ed0371d3f2c39d30/sheeprl/algos/sac/sac.py#L295), where we gather data from all ranks before running a distributed update, we should add the possibility for all the Dreamer implementations...
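A rough sketch of what gathering every rank's batch before the update could look like with Lightning Fabric; the buffer fields and shapes are placeholders, not sheeprl's actual buffer API.

```python
# Hedged sketch: gather each rank's locally sampled batch on all processes
# before running the update, mirroring the coupled SAC approach.
import torch
from lightning.fabric import Fabric


def train_step(fabric: Fabric) -> None:
    # Pretend each rank sampled its own batch from its replay buffer
    # (field names and shapes are placeholders).
    local_batch = {
        "observations": torch.randn(32, 4),
        "actions": torch.randn(32, 1),
    }
    gathered = fabric.all_gather(local_batch)
    # In a multi-process run `all_gather` prepends a world-size dimension;
    # flattening it gives every rank the union of all ranks' samples.
    if fabric.world_size > 1:
        gathered = {k: v.flatten(0, 1) for k, v in gathered.items()}
    # ... the update would then consume `gathered` on every rank.


if __name__ == "__main__":
    fabric = Fabric(accelerator="cpu", devices=1)
    fabric.launch()
    train_step(fabric)
```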
I am thinking of using sheeprl as the base for my RL experiments! My work usually builds on DQN-type algorithms: in increasing order of complexity, DDQN, Rainbow,...
Starting from the available pseudocode, do the following: - [x] Verify available infrastructure (buffer, models) - [x] Remove multiplayer support - [ ] Create config file - [ ] Remove...
Write a how-to file in markdown specifying how to contribute to the repo
The `per_layer_ortho_init_weights` function is implemented but never used. We should add an option to activate it when a neural network is created.
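A sketch of how an opt-in flag could wire per-layer orthogonal initialization into network construction; the helper below is a stand-in and does not necessarily match the signature of sheeprl's `per_layer_ortho_init_weights`.

```python
# Hedged sketch: an opt-in flag that applies per-layer orthogonal
# initialization when an MLP is created. The init helper is a stand-in.
import torch.nn as nn


def per_layer_ortho_init(module: nn.Module, gain: float = 1.0) -> None:
    """Orthogonally initialize every Linear layer's weight and zero its bias."""
    for layer in module.modules():
        if isinstance(layer, nn.Linear):
            nn.init.orthogonal_(layer.weight, gain=gain)
            if layer.bias is not None:
                nn.init.zeros_(layer.bias)


def build_mlp(in_dim: int, hidden: int, out_dim: int, ortho_init: bool = False) -> nn.Module:
    net = nn.Sequential(
        nn.Linear(in_dim, hidden),
        nn.SiLU(),
        nn.Linear(hidden, out_dim),
    )
    if ortho_init:  # the opt-in flag this issue asks for
        per_layer_ortho_init(net)
    return net
```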
Hi, when I run `python sheeprl.py exp=dreamer_v3 env=mujoco env.id=Walker2d-v4 algo.cnn_keys.encoder=[rgb]`, the following is the output: Rank-0: policy_step=788, reward_env_2=0.31653469800949097 Rank-0: policy_step=828, reward_env_2=-1.8613801002502441 Rank-0: policy_step=828, reward_env_3=-0.6631340384483337 Rank-0: policy_step=840, reward_env_0=-3.4890027046203613 Rank-0:...
I tried; "sheeprl exp=dreamer_v3 env=gym env.id=CartPole-v1" this one and i got "ValueError: you tried to log -1 which is currently not supported. Try a dict or a scalar/tensor. Set the...
## Summary This PR adds various env wrappers: * [ClipAction](https://gymnasium.farama.org/api/wrappers/action_wrappers/#gymnasium.wrappers.ClipAction) * NormalizeObservationWrapper: inspired by the [NormalizeObservation](https://gymnasium.farama.org/api/wrappers/observation_wrappers/#gymnasium.wrappers.NormalizeObservation), applies normalization to `Dict` spaces * [NormalizeReward](https://gymnasium.farama.org/api/wrappers/reward_wrappers/#gymnasium.wrappers.NormalizeReward) * Observation clip: clip the observations in...
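For context, a minimal sketch of a running-mean/std normalizer over `Dict` observation spaces, in the spirit of the wrapper listed above; class and attribute names are illustrative, not the PR's actual implementation.

```python
# Hedged sketch of a Dict-space observation normalizer inspired by
# gymnasium's NormalizeObservation; names and details are illustrative.
import gymnasium as gym
import numpy as np


class _RunningMeanStd:
    """Tracks a running mean and variance with batched Welford-style updates."""

    def __init__(self, shape):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 1e-4

    def update(self, x):
        batch_mean, batch_var, batch_count = x.mean(0), x.var(0), x.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        self.mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        self.var = (m_a + m_b + delta**2 * self.count * batch_count / total) / total
        self.count = total


class DictNormalizeObservation(gym.ObservationWrapper):
    """Applies per-key running normalization to Dict observation spaces."""

    def __init__(self, env, epsilon: float = 1e-8):
        super().__init__(env)
        self.epsilon = epsilon
        self.rms = {
            k: _RunningMeanStd(space.shape)
            for k, space in env.observation_space.spaces.items()
        }

    def observation(self, obs):
        out = {}
        for k, v in obs.items():
            self.rms[k].update(np.asarray(v, dtype=np.float64)[None, ...])
            out[k] = (v - self.rms[k].mean) / np.sqrt(self.rms[k].var + self.epsilon)
        return out
```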
I cannot _sheeprl-eval_ my trained model, since the keys in the world model's state_dict have different names: Stacktrace: Error executing job with overrides: ['checkpoint_path=/home/drt/Desktop/sheeprl/sheeprl/logs/runs/dreamer_v3/PyFlyt/2024-06-23_19-34-31_dreamer_v3_PyFlyt_42/version_0/checkpoint/ckpt_730000_0.ckpt', 'fabric.accelerator=gpu', 'env.capture_video=True', 'seed=52'] Traceback (most recent...
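If the mismatch turns out to be only a key prefix (for example the `_orig_mod.` prefix that `torch.compile` adds to compiled modules), a possible workaround is to strip it before loading; this is a guess at the cause, not a confirmed fix, and the checkpoint layout below is an assumption.

```python
# Hedged workaround sketch: strip a possible "_orig_mod." prefix from the
# world model's state_dict keys before loading. The checkpoint path and the
# "world_model" key are assumptions; inspect ckpt.keys() first.
import torch

ckpt = torch.load("path/to/ckpt.ckpt", map_location="cpu")  # placeholder path
state = ckpt["world_model"]  # assumed checkpoint key

cleaned = {k.removeprefix("_orig_mod."): v for k, v in state.items()}
# world_model.load_state_dict(cleaned)
```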