Edward Beeching
Thanks for the PR! Can you add some details about what feature/example this implements?
Hey, not yet. Hopefully tomorrow. Thanks for this.
Hi, I got around to trying this. Unfortunately, the mac build templates don't work on linux. I think we would have to include a build server as part of the...
Hi, thanks for raising this issue. Regarding logging, there should be a log file created in tmp/results/doom_rl/*. Thanks for the note about the plotting code; I will add this file...
Thanks for your question. I have added trained models for all scenarios in [saved_models](https://github.com/edbeeching/3d_control_deep_rl/tree/master/saved_models). Note these models were trained with ViZDoom version 1.1.4 (some textures changed in the more recent...
I just realized I provided code for analyzing the attention distribution, not the TSNE. I added the TSNE code [here](https://github.com/edbeeching/3d_control_deep_rl/blob/master/visualization/hidden_state_tsne_analysis.py); again, you will need to modify it a bit to get...
Hey, did you try the sb3 models in the latest version of the repo? I did not test the rllib export much.
In general, we observe better performance with the full finetune, although we did not perform a full hyperparameter scan on the lora configs, so I am sure improvements can be...
DeepSpeed ZeRO-3 will shard the model over several GPUs, which should resolve the OOM issues you see. Note we tested on A100 GB GPUs, so you may need to tweak...
Hi @alvarobartt, sorry for the delay. Yes, we are using flash attn. @tcapelle, if you have lower GPU memory you can use lora (peft) to perform finetuning.