ipex-llm
Issue with QLoRA fine-tuning on Flex GPU
Hi,
I am trying to use the QLoRA code provided in the repo on a Sapphire Rapids machine with a Flex GPU.
I was able to run qlora_finetuning.py without any error.
But export_merged_model.py is giving me this error:
The command I used to merge the model: `python ./export_merged_model.py --repo-id-or-model-path <path to llama-2-7b-chat-hf> --adapter_path ./outputs/checkpoint-200 --output_path ./outputs/checkpoint-200-merged`
OS : Ubuntu 22
This is my training info:
Hi @tsantra, would you mind trying it again after `pip install accelerate==0.23.0`?
@rnwang04 Thank you. It worked after installing accelerate==0.23.0.
I have two questions:
- Is QLoRA fine-tuning supported on CPU?
- The code at https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/QLoRA-FineTuning/export_merged_model.py shows `device_map={"": "cpu"}`, so which part of the code runs on the Flex GPU?
Hi @tsantra ,
- Yes, it's supported on CPU; we will provide an official CPU example later.
- After you get the merged model (for example `checkpoint-200-merged`), you can use it like a normal Hugging Face Transformers model to do inference on the Flex GPU, as in https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2 and the sketch below.
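For illustration, a minimal sketch of that inference flow (not taken from the thread; the merged-checkpoint path, prompt, and generation settings are placeholders modeled on the linked llama2 example):

```python
# Minimal sketch: run the merged QLoRA checkpoint on an Intel Flex GPU with BigDL-LLM.
# Paths, prompt, and generation settings below are placeholders.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401 - registers the "xpu" device
from transformers import LlamaTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM

merged_path = "./outputs/checkpoint-200-merged"  # output of export_merged_model.py

# Load the merged model with 4-bit weight quantization and move it to the GPU.
model = AutoModelForCausalLM.from_pretrained(merged_path, load_in_4bit=True)
model = model.to("xpu")

tokenizer = LlamaTokenizer.from_pretrained(merged_path)
inputs = tokenizer("What is AI?", return_tensors="pt").to("xpu")

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```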
Hi @tsantra, the QLoRA CPU example is now available here: https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/QLoRA-FineTuning
Hi @rnwang04 , thank you for your reply!
Are you using any metric to check model accuracy after QLoRA fine-tuning? I used my custom dataset for fine-tuning and my inference results are not good; the model is hallucinating a lot. Do you have any BKM for fine-tuning?
Are you also using any profiler to check GPU memory usage? Do you have any suggestions?
Had closed by mistake.
@rnwang04 GPU fine-tuning suddenly stopped working and gave a Seg Fault.
Hi @tsantra, have you ever run GPU fine-tuning successfully, or do you always hit this error? If you have run GPU fine-tuning successfully before, did you make any changes to your script or environment settings?
> Are you using any metric to check model accuracy after QLoRA fine-tuning? I used my custom dataset for fine-tuning and my inference results are not good; the model is hallucinating a lot. Do you have any BKM for fine-tuning?
Have you checked the loss curve of your fine-tuning? Is the loss decreasing normally during the fine-tuning process and ultimately stabilizing at a fixed value? What are the approximate train loss and eval loss at the end?
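One way to check this: if the fine-tuning used the Hugging Face `Trainer` (as the QLoRA example does), every checkpoint directory contains a `trainer_state.json` whose `log_history` records the logged losses. A rough sketch for inspecting it (the checkpoint path is just the one mentioned in this thread and may differ for you):

```python
# Rough sketch: read the loss values logged by the HF Trainer from a checkpoint.
import json

with open("./outputs/checkpoint-200/trainer_state.json") as f:
    state = json.load(f)

for entry in state["log_history"]:
    if "loss" in entry:           # training loss entries
        print(f"step {entry['step']}: train loss = {entry['loss']}")
    if "eval_loss" in entry:      # evaluation loss entries (if eval was enabled)
        print(f"step {entry['step']}: eval loss = {entry['eval_loss']}")
```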
> Are you also using any profiler to check GPU memory usage? Do you have any suggestions?
I just use the "GPU Memory Used" column of `sudo xpu-smi stats -d 0` to check GPU memory usage.
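If you want an in-process number rather than polling `xpu-smi`, IPEX builds with XPU support also expose a `torch.cuda`-style memory API under `torch.xpu`; a hedged sketch (whether these calls are available depends on your IPEX version):

```python
# Hedged sketch: query XPU memory from inside the training script via IPEX,
# assuming your IPEX build provides the torch.cuda-like API under torch.xpu.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401 - provides the "xpu" backend

def log_xpu_memory(tag=""):
    # Current and peak tensor memory on the first XPU device, in MiB.
    allocated = torch.xpu.memory_allocated(0) / 1024**2
    peak = torch.xpu.max_memory_allocated(0) / 1024**2
    print(f"[{tag}] XPU memory allocated: {allocated:.1f} MiB (peak {peak:.1f} MiB)")

# e.g. call log_xpu_memory("after forward") at interesting points in the training loop.
```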
> @rnwang04 GPU fine-tuning suddenly stopped working and gave a Seg Fault.
Are you running it inside VS Code?