Sebastian Raschka

821 comments by Sebastian Raschka

Implemented all the suggestions, @carmocca. Should be good to review.

Argh, it all works fine with StableLM. But I just noticed that this causes issues with Falcon:

```
size mismatch for transformer.h.20.attn.attn.weight: copying a param with shape torch.Size([4672, 4544]) from...
```
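For context, a quick sketch of where that 4672 likely comes from, assuming Falcon-7B's usual multi-query attention layout (71 query heads of size 64 plus one shared key/value head); the names below are purely illustrative:

```python
# Illustrative arithmetic (assumed Falcon-7B layout): fused QKV rows =
# query heads * head size + one shared key head + one shared value head.
n_embd, head_size, n_query_heads = 4544, 64, 71
qkv_rows = n_query_heads * head_size + 2 * head_size
print(qkv_rows, n_embd)  # 4672 4544 -> matches torch.Size([4672, 4544])
```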

I just noticed this also needs the `ds_config` for DeepSpeed. Will add this to the PR shortly.

Should we also change this to FSDP before merging, @carmocca, or figure it out later?

Besides FSDP and Falcon, everything should be addressed now. Thanks for the thorough review!

@k21993 LoRA with Falcon 7B should work on a single GPU with ~16 GB. If not, you can change `micro_batch_size = 4` to `micro_batch_size = 1` (it only affects...
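A minimal sketch of that change, assuming the hyperparameters sit at the top of `finetune/lora.py` as in the lit-gpt-style scripts (the surrounding names and values are illustrative):

```python
# Hypothetical excerpt from finetune/lora.py: lowering micro_batch_size reduces peak
# memory; gradient accumulation keeps the effective batch size unchanged.
batch_size = 128
micro_batch_size = 1  # reduced from 4 to fit a single ~16 GB GPU
gradient_accumulation_iters = batch_size // micro_batch_size
```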

That's weird. Here are the complete settings I used: https://github.com/rasbt/LLM-finetuning-scripts/blob/main/lit-benchmarks/falcon-7b/finetune/lora.py, run via

```
python finetune/lora.py --checkpoint_dir checkpoints/tiiuae/falcon-7b/
```

The peak memory use was 16.97 GB according to

```python
print(f"Memory used: {torch.cuda.max_memory_reserved() /...
```
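For reference, that memory report probably looks something like the following sketch; the divisor (bytes to GB) and formatting here are assumptions, not the exact line from the linked script:

```python
import torch

# Report the peak reserved CUDA memory after the run, converted from bytes to GB.
print(f"Memory used: {torch.cuda.max_memory_reserved() / 1e9:.02f} GB")
```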

Regarding the 1-GPU setting you have above, you mention `micro_batch_size = 4`. So if you set this to `micro_batch_size = 1`, it should theoretically work: 67,775 MiB /...

Regarding multi-GPU training, it is currently set to DeepSpeed stage 2, which is not very memory-efficient (it optimizes for speed). If you set this to DeepSpeed stage 3,...
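A minimal sketch of that switch, assuming the script passes a `ds_config` dict to Lightning Fabric's `DeepSpeedStrategy`; the device count, batch settings, and precision below are illustrative:

```python
from lightning.fabric import Fabric
from lightning.fabric.strategies import DeepSpeedStrategy

# ZeRO stage 3 shards optimizer states, gradients, and parameters across GPUs,
# trading some speed for lower per-GPU memory compared to stage 2.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 128,
    "zero_optimization": {"stage": 3},  # was: {"stage": 2}
}

fabric = Fabric(devices=4, strategy=DeepSpeedStrategy(config=ds_config), precision="bf16-mixed")
```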

That makes sense, thanks for clarifying! I could give it a try when I am back from CVPR next week (currently working on code for the talk) but if there...