Michele Milesi comments

Results: 13 comments of Michele Milesi

Unexpected behaviour in inference with merged QLoRA weights

Hi @Andrei-Aksionov, thanks for your support. We used commit 1e5afd6fb5653eddc15aafcae8c20f5222e4e1e3. The only two things we did are: 1. Commented out the line that calls the `merge_lora_weights()` function...
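For readers unfamiliar with what merging LoRA weights means, here is a minimal sketch of the standard LoRA merge arithmetic, `W' = W + (alpha / r) * B @ A`. It is not the repository's `merge_lora_weights()` implementation. The names `matmul`, `merge_lora`, `weight`, `lora_a`, `lora_b`, and `alpha` are hypothetical and chosen only for illustration, and the sketch ignores quantization entirely, which is the part that matters for QLoRA.

```python
# Illustrative only: plain-Python LoRA merge on nested lists.
# All names here are hypothetical; this is not the library's code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return weight + (alpha / r) * lora_b @ lora_a.

    Assumed shapes: weight is (out, in), lora_b is (out, r), lora_a is (r, in).
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update with shape (out, in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1, 0], [0, 1]]  # frozen base weight, (2, 2)
lora_a = [[1, 2]]          # rank-1 down projection, (1, 2)
lora_b = [[1], [3]]        # rank-1 up projection, (2, 1)

merged = merge_lora(weight, lora_a, lora_b, alpha=2, r=1)
```

After merging, a single matrix multiply with `merged` gives the same result as applying the base weight and the scaled low-rank branch separately, which is why the adapter can be dropped at inference time in the unquantized case.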

dv3 Imagination notebook re-visited

Hi @anthony0727, 1. The player keeps the stochastic state flattened, as you can see here: https://github.com/Eclectic-Sheep/sheeprl/blob/40035066a55b76fd9f9dc4d92ee5a749e079e6b1/sheeprl/algos/dreamer_v3/agent.py#L655 or here: https://github.com/Eclectic-Sheep/sheeprl/blob/40035066a55b76fd9f9dc4d92ee5a749e079e6b1/sheeprl/algos/dreamer_v3/agent.py#L686 2. You are right, the reconstructed observation should be incremented...

dv3 Imagination notebook re-visited

Hi @anthony0727, we created a branch to fix this issue. Can you check whether it works? (https://github.com/Eclectic-Sheep/sheeprl/tree/fix/dv3-imagination-notebook) Thanks

Pure python training, evaluation and rollout documentation request.

Hi @belerico, yes, we can start with something similar to the two examples you mentioned. For the environment part, I think we can reuse [this](https://lightning.ai/or-bix-srl/studios/sheeprl-how-to-integrate-super-mario-bros-enviroment?view=public&section=tutorials). Or are...

Last `N` actions as `mlp_keys` encoder input for `dreamer_v3`
