fsdp_qlora
Training LLMs with QLoRA + FSDP
The README mentions: "The SFTTrainer version has to run with a lower batch size (4 vs 8) so we only do 2 gradient accumulation steps vs 4 in the...
When I tried to train a QnA-style dataset such as knowrohit07/know_sql, I got this error.
Is training with 1024 or 2048 sequence length feasible using this method?
Thanks for such wonderful work! I see you comment out this line: https://github.com/AnswerDotAI/fsdp_qlora/blob/d7818ec86d17f37db4beef36f80870cbcac37957/train.py#L722 May I ask what is the rationale behind it? Is fsdp_qlora compatible with torch compile?
I think there is a bug in the DoRA implementation as it takes neither `lora_dropout` nor `lora_alpha` into account. These arguments are passed as `*args` to the `__init__` call of...
Add option for local 'custom.jsonl' dataset with llama3 prompt format
Add conversion script for merging fsdp model_state_dict with model
Hi, I've fixed the bug in `Converting the State Dict.ipynb`
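The state-dict conversion mentioned in the entries above typically boils down to renaming keys so that weights saved under an FSDP-wrapped module can be loaded back into the unwrapped model. A minimal sketch of that renaming step, assuming a hypothetical wrapper prefix of `_fsdp_wrapped_module.` (the actual prefix depends on the FSDP configuration and the notebook's logic):

```python
# Hypothetical sketch: strip an assumed FSDP wrapper prefix from state-dict
# keys so the checkpoint can be loaded into the plain (unwrapped) model.
PREFIX = "_fsdp_wrapped_module."  # assumption, not taken from the repo

def strip_fsdp_prefix(state_dict):
    """Return a new dict with the wrapper prefix removed from each key."""
    return {
        (key[len(PREFIX):] if key.startswith(PREFIX) else key): value
        for key, value in state_dict.items()
    }

# Toy usage with placeholder values standing in for weight tensors.
wrapped = {
    "_fsdp_wrapped_module.model.embed_tokens.weight": "w0",
    "_fsdp_wrapped_module.lm_head.weight": "w1",
    "already_clean.bias": "w2",
}
clean = strip_fsdp_prefix(wrapped)
print(sorted(clean))
```

In practice the same renaming would be applied to real tensors (e.g. from `torch.load(...)`) before calling `model.load_state_dict(clean)`; this sketch only illustrates the key transformation.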
Hi there, just wondering: does this repo support fine-tuning a Vision Language Model (VLM), e.g. https://huggingface.co/microsoft/Phi-3.5-vision-instruct? Many thanks for any help, and for this amazing lib!