Sebastian Raschka
Hi there! Image/video is not supported, but I can certainly add a Trainer recipe at some point. Thanks for suggesting!
Thanks for the feedback. It does work on 4 x L4s, which have 24 GB each. I can see that the usage is around 22-24 GB. Other than trying a...
It was on each GPU. I think it uses substantially less memory than 22 GB x 4 in total, though; it might be that it works just fine on a...
Ah yes, `litgpt finetune ...` uses LoRA by default. For full finetuning, it's `litgpt finetune_full ...`
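E.g., roughly (assuming the checkpoint from above has already been downloaded; the exact arguments can vary a bit between litgpt versions):

```
# parameter-efficient finetuning with LoRA (the default behind `litgpt finetune`)
litgpt finetune checkpoints/meta-llama/Llama-2-7b-hf

# full finetuning, i.e., updating all model weights
litgpt finetune_full checkpoints/meta-llama/Llama-2-7b-hf
```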
Hm, I haven't had any issues with that recently, but there have been a couple of changes in the last few days. I assume it's the same issue with...
@KOVVURISATYANARAYANAREDDY I just tried it and it works fine for me with Llama 2 7B:

```
(qlora) sebastian@hyperplane1:~/Developer/prs/debug/lit-gpt$ python finetune/lora.py --precision "bf16-true" --quantize "bnb.nf4" --checkpoint_dir checkpoints/meta-llama/Llama-2-7b-hf/
{'eval_interval': 100, 'save_interval': 100,...
```
Which dataset are you using? I was using Alpaca, which has relatively short contexts.
Just for debugging purposes, what is your memory usage if you use the default lora.py script with a micro-batch size of 1 on Alpaca, so that we can compare to my results...
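For comparing numbers, something like this is what I have in mind; it's just a generic `nvidia-smi` polling command to watch per-GPU memory while the script runs, nothing lit-gpt-specific:

```
# refresh per-GPU memory usage every second while the finetuning script runs in another shell
nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv -l 1
```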
Interesting, so the problem is the longer contexts then?
I see. Hm, that's interesting. So on shorter contexts, bnb.nf4 performs better, but on longer contexts it performs worse?