Sebastian Raschka

818 comments by Sebastian Raschka

Hi there! Image/video is not supported, but I can surely add a Trainer recipe sometime. Thanks for suggesting!

Thanks for the feedback. It does work on 4 x L4s, which have 24 GB each. I can see that the usage is around 22-24 GB. Other than trying a...

It was on each GPU. I think that it uses substantially less RAM than 22 GB x 4 in total, though; it might be that it works just fine on a...
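
For reference, one way to check whether the ~22-24 GB figure is per device or a total is to watch per-GPU memory while the run is active. A minimal sketch (not from the original thread):

```
# Report memory in use on each GPU (values in MiB) during the finetuning run,
# to confirm the ~22-24 GB figure is per device rather than summed across GPUs.
nvidia-smi --query-gpu=index,memory.used --format=csv,noheader
```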

Ah yes, `litgpt finetune ...` uses LoRA by default. For full finetuning, it's `litgpt finetune_full ...`
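
A minimal sketch of the two commands; the checkpoint path is borrowed from the later comment, and the exact argument form may differ between litgpt versions:

```
# LoRA finetuning (the default behind `litgpt finetune`)
litgpt finetune checkpoints/meta-llama/Llama-2-7b-hf

# Full-parameter finetuning
litgpt finetune_full checkpoints/meta-llama/Llama-2-7b-hf
```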

Hm, I haven't had any issues with that recently, but there have been a couple of changes in the last few days. I assume it's the same issue with...

@KOVVURISATYANARAYANAREDDY I just tried it and it works fine for me with Llama 2 7B: ``` (qlora) sebastian@hyperplane1:~/Developer/prs/debug/lit-gpt$ python finetune/lora.py --precision "bf16-true" --quantize "bnb.nf4" --checkpoint_dir checkpoints/meta-llama/Llama-2-7b-hf/ {'eval_interval': 100, 'save_interval': 100,...
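
The command from that run, reformatted for readability:

```
# QLoRA-style finetuning of Llama 2 7B, as shown in the excerpt above
python finetune/lora.py \
  --precision "bf16-true" \
  --quantize "bnb.nf4" \
  --checkpoint_dir checkpoints/meta-llama/Llama-2-7b-hf/
```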

Which dataset are you using? I was using Alpaca, which has relatively short contexts.

Just for debugging purposes, what is your memory usage if you use the default `lora.py` script with micro-batch size 1 on Alpaca, so that we can compare to my results...

Interesting, so the problem is the longer contexts then?

I see. Hm, that's interesting. So on short contexts `bnb.nf4` performs better, but on longer contexts it performs worse?