BugReporterZ

Results: 16 comments by BugReporterZ

@gardner That appears to fix axolotl not installing and running in my case, but there are still issues with training: memory usage seems unusually high compared to...

Reverting to an axolotl commit from mid-December (`5f79b82`, though I haven't investigated exactly when the issues began), reinstalling the packages, then uninstalling `flash-attn` and running `pip install flash-attn==2.3.2` fixes the issue. Training Mistral-7B...
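A minimal sketch of that workaround, assuming a local checkout of the axolotl repo and an already-activated environment (the repo path and editable install are assumptions, not from the original comment):

```shell
# Check out the known-good mid-December commit (5f79b82, as mentioned above).
cd axolotl
git checkout 5f79b82

# Reinstall the package and its dependencies at that commit.
pip install -e .

# Replace the flash-attn that requirements.txt pulls in (2.3.3)
# with the version that showed normal VRAM usage (2.3.2).
pip uninstall -y flash-attn
pip install flash-attn==2.3.2
```

Note the double equals sign: pip version pins use `==` (a single `=` is rejected as an invalid requirement).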

The increased VRAM usage could possibly be related to https://github.com/OpenAccess-AI-Collective/axolotl/issues/1127

I tracked the issue down to `flash-attn` from `pip`. Version 2.3.2 works; the newer version specified in `requirements.txt` (2.3.3) causes problems. At the moment I'm on torch 2.0.1, though.
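If the fix holds up, pinning the working version would be a one-line change to `requirements.txt` (a sketch, assuming nothing else in the project requires 2.3.3):

```
flash-attn==2.3.2
```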

Thanks for replying! Great to learn that there are no inherent issues preventing FlashAttention from being combined with QLoRA. With the latest FlashAttention 2 promising even further performance improvements, and given that...

Perhaps some of the code from [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) could be reused. It's a trainer that supports QLoRA together with different attention implementations, including FlashAttention. I haven't been able to make FlashAttention work...