Carlos Mocholí

I have a fix in #171 that will reduce the memory requirements for fine-tuning and training.

@k21993 The fix above also applies to LoRA.

I merged #173, which should fix the FLOPs counter issue. I'll now try to replicate the sequence-length issues you are seeing.
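
For context on what the counter measures: a common rule of thumb estimates training cost at roughly 6 FLOPs per parameter per token. A minimal sketch of that approximation (this is not the #173 fix, and the numbers in the example are illustrative):

```python
def estimate_training_flops(n_params: float, n_tokens: float) -> float:
    """Rule-of-thumb training cost: ~6 FLOPs per parameter per token
    (~2 for the forward pass, ~4 for backward)."""
    return 6.0 * n_params * n_tokens

# Illustrative numbers: a 7B-parameter model seeing 1B tokens
print(f"{estimate_training_flops(7e9, 1e9):.2e} FLOPs")
```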

Hey all. Using current `main`, here's what I'm calling: `python finetune/adapter.py --checkpoint_dir checkpoints/tiiuae/falcon-7b --precision bf16-true`. With `micro_batch_size=1`, I get a constant ~16 GB of memory use. It might seem to slowly creep up,...
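
In case it's useful for comparing numbers, here's a minimal sketch of how I'd measure peak GPU memory around a run like this on a CUDA machine; `run_finetuning` is a hypothetical stand-in for the training loop in `finetune/adapter.py`:

```python
import torch

def report_peak_memory(device: int = 0) -> None:
    # Peak memory held by tensors since the last reset, in GiB
    peak_gib = torch.cuda.max_memory_allocated(device) / 1024**3
    print(f"Peak allocated: {peak_gib:.2f} GiB")

torch.cuda.reset_peak_memory_stats()
# run_finetuning()  # hypothetical stand-in for the fine-tuning loop
report_peak_memory()
```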

I merged #178, which should bring a small decrease in memory usage. I'll also be adding #182, which includes a change so that the longest Alpaca sequence is loaded first,...
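
For context on why loading the longest sequence first helps: peak memory is reached on the very first step, so an out-of-memory failure surfaces immediately rather than deep into a run. A minimal sketch of the idea (not the actual #182 change; the `input_ids` key is an assumption):

```python
from typing import Dict, List

def longest_first(samples: List[Dict]) -> List[Dict]:
    """Return the samples ordered so the longest sequence comes first.

    Peak memory is then hit on step one, so an OOM shows up right away.
    Assumes each sample stores its tokenized prompt under "input_ids".
    """
    return sorted(samples, key=lambda s: len(s["input_ids"]), reverse=True)
```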

Looks like an issue with the model instantiation. Can you pull `main` and call `scripts/convert_hf_checkpoint.py` again?

The same technique should work on Falcon; there's nothing substantially different in how the model is pre-trained.

FSDP also fails with a similar error. This is because we access the LoRA parameters in `.train()` instead of `.forward()` (https://github.com/Lightning-AI/lit-llama/blob/main/lit_llama/lora.py#L270-L273). @awaelchli suggested removing these calls from the fine-tuning...
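
To illustrate the distinction, here's a minimal sketch of a LoRA linear layer that applies the low-rank update inside `forward()` instead of merging weights in `.train()`, so FSDP only sees parameter access where it expects it. The class and hyperparameters are illustrative, not the actual lit-llama implementation:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Linear layer whose LoRA update is computed in forward(), so no
    weights are merged or mutated in train()/eval()."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.lora_a = nn.Parameter(torch.empty(r, in_features))
        self.lora_b = nn.Parameter(torch.zeros(out_features, r))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base projection plus the low-rank update, computed on the fly:
        # no parameter access happens outside of forward()
        return self.linear(x) + F.linear(F.linear(x, self.lora_a), self.lora_b) * self.scaling
```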

The issue above should only affect LoRA. Is the error exactly the same for Adapter?

Hi. Unfortunately, Python 3.7 is not supported, as the error message indicates. You'll have to upgrade to 3.8, 3.9, or 3.10. See https://pytorch.org/blog/deprecation-cuda-python-support and https://endoflife.date/python
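
For context, the check behind that message is just a startup guard on `sys.version_info`; a minimal sketch of the pattern, with the exact wording here being illustrative:

```python
import sys

# Fail fast on interpreters older than the minimum supported version
if sys.version_info < (3, 8):
    raise RuntimeError(
        f"Python {sys.version_info.major}.{sys.version_info.minor} is not"
        " supported; please upgrade to 3.8, 3.9, or 3.10."
    )
```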