Blake
For anyone else wondering why: https://github.com/philschmid/deep-learning-pytorch-huggingface/pull/30
@philschmid Thanks for getting back to me, and thanks for your work and the blog post! I eventually saw that, and I was able to use your work to get fine-tuning working...
@philschmid The old 70B patch, while it supports a backward and forward pass, still has issues. When I try to generate text with the model after training with QLoRA, I...
This repo here has a working implementation for all models, namely 7B, 13B, and 70B. It's licensed as GPL 3.0, but for my repo, which is AGPL, that is...
Cool thanks! Feel free to close this issue once you add the fix
This just happened with the 8B model too. I suspect it may have something to do with bitsandbytes, but I am not sure.
It's happening when not using quantization as well. Still pseudo-random.
Rebooting sometimes helps. Maybe it's a hardware issue.
It didn't happen on previous versions, so if it is hardware related, it's either a recent development or a newly introduced bug.
So use the latest accelerate and install peft from main? I will do the following:
pip install transformers bitsandbytes trl accelerate
pip install git+https://github.com/huggingface/peft.git
I will let you know
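Since the behavior here seems to depend on which package versions are installed (and whether peft came from PyPI or from main), it can help to report the exact environment back on the issue. A minimal sketch for doing that, using only the standard library; the package list is taken from the commands above:

```python
from importlib.metadata import version, PackageNotFoundError

def report_versions(packages):
    """Return a {package: version} mapping, or 'not installed' for missing ones."""
    out = {}
    for name in packages:
        try:
            out[name] = version(name)
        except PackageNotFoundError:
            out[name] = "not installed"
    return out

if __name__ == "__main__":
    # Packages mentioned in this thread.
    for name, ver in report_versions(
        ["transformers", "bitsandbytes", "trl", "accelerate", "peft"]
    ).items():
        print(f"{name}: {ver}")
```

Pasting that output into the issue makes it much easier to tell whether the pseudo-random failures correlate with a particular transformers/peft combination.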