
Training LLMs with QLoRA + FSDP

39 fsdp_qlora issues, sorted by recently updated:

This PR enables loading models that contain only PyTorch `pytorch_model.bin` format files (not the safetensors format).

Hey, I'm loving the goal of lowering the resource requirements for training! In this paper https://arxiv.org/abs/2403.06504 they claim direct memory access between the GPU and NVMe storage is more efficient at swapping...

I had to change this code in `train.py` to get it to work on my system: `# LoRA and DORA modules`, `sys.path.append("./scripts")`, `from scripts.lora import LORA`, `from scripts.dora...`

Fix the `RuntimeError` in https://github.com/AnswerDotAI/fsdp_qlora/issues/28 once and for all by protecting the main code with an `if __name__ != '__main__': return` guard. **Background:** the original issue occurred when the `awq` package was imported...
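As a rough sketch of the guard pattern this PR describes (not the actual diff), the idea is an early return when the module is re-executed under a different name, e.g. by a `torch.multiprocessing` spawn or a package import. The function name and return values here are hypothetical, for illustration only:

```python
def guarded_entry(module_name: str) -> str:
    """Early-return guard: skip the training entry point when the module
    is being re-imported (e.g. by multiprocessing's spawn start method,
    which re-imports the main module in each child process)."""
    if module_name != "__main__":
        # Re-import, not a direct run: do nothing so the training
        # code is not executed a second time.
        return "skipped"
    # Direct `python train.py` run: proceed with training.
    return "trained"
```

In the real script the guard sits at the top of the main function, so importing the module (as `awq` indirectly triggers) no longer re-runs the training loop.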

Hello, I've successfully finetuned Llama-3 8B with QDoRA and am now looking to perform inference using vLLM. Could you provide guidance or scripts on how to merge the QDoRA adapters...
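There is no merge script in the excerpt above, but the DoRA merge math itself is simple. The sketch below is an illustration of that math under stated assumptions, not an official repo utility: it assumes already-dequantized base weights `W` (for QDoRA you would dequantize first), low-rank factors `A` and `B`, and a learned per-column magnitude vector `m`:

```python
import numpy as np

def merge_dora(W, A, B, m):
    """Illustrative DoRA-style merge (hypothetical helper, not from the repo):
    fold the low-rank update into the base weight, then rescale each column
    to the learned magnitude vector m, per the DoRA formulation
    W' = m * (W + B @ A) / ||W + B @ A||_col."""
    W_prime = W + B @ A                                   # LoRA-style update, shape (out, in)
    col_norms = np.linalg.norm(W_prime, axis=0, keepdims=True)
    return m * W_prime / col_norms                        # per-column magnitude rescale
```

After a merge like this, the resulting dense weights can be saved as a plain checkpoint, which is the form inference engines such as vLLM expect.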

Hi, I met the following error when finetuning a llama7b model with FSDP+HQQ:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 74, in _wrap
    fn(i, *args)
  File "/workspace/fsdp_qlora/train.py", line 723,...
```

Hi, I tried to finetune a llama7b model with HQQ-LoRA using dual GPUs. I found that during "Loading & Quantizing Model Shards", peak GPU memory usage reached 35 GB. What's...

Hello, thank you for the awesome work! Could you please add support for the DeepSeek VL model?

Here's the command I ran:
```
python train.py \
  --model_name meta-llama/Llama-2-70b-hf \
  --batch_size 1 \
  --context_length 1024 \
  --precision bf16 \
  --train_type hqq_lora \
  --use_gradient_checkpointing true \
  --use_cpu_offload false \...
```

I used this script to fine-tune Llama 3 (from the AnswerAI blog post); what I'm left with is a state dict that I am unable to use to replace layers...