
issue with qlora fine-tuning on Flex GPU

Open tsantra opened this issue 1 year ago • 10 comments

Hi,

I am trying to run the QLoRA example code from the repo on a Sapphire Rapids machine with a Flex GPU.

I was able to run qlora_finetuning.py without any errors.

But export_merged_model.py gives me this error:

[screenshot: error traceback]

The command I used to merge the model:

`python ./export_merged_model.py --repo-id-or-model-path <path to llama-2-7b-chat-hf> --adapter_path ./outputs/checkpoint-200 --output_path ./outputs/checkpoint-200-merged`

OS: Ubuntu 22

This is my training info:

[screenshot: training configuration]

tsantra avatar Oct 30 '23 23:10 tsantra

Hi @tsantra, would you mind trying again after `pip install accelerate==0.23.0`?

rnwang04 avatar Oct 31 '23 02:10 rnwang04

@rnwang04 Thank you. It worked after installing `accelerate==0.23.0`.

I have two questions:

  1. Is QLoRA fine-tuning supported on CPU?
  2. The code at https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/QLoRA-FineTuning/export_merged_model.py shows `device_map={"": cpu}`, so which part of the code is running on the Flex GPU?

tsantra avatar Oct 31 '23 22:10 tsantra

Hi @tsantra ,

  1. Yes, it's supported on CPU; we will provide an official CPU example later.
  2. Once you have the merged model (for example, checkpoint-200-merged), you can use it like a normal Hugging Face Transformers model to run inference on the Flex GPU; see https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2
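For intuition on what the merge step produces: folding a trained LoRA adapter into the base weights is, conceptually, `W' = W + (alpha/r) * B @ A`, after which inference needs only the single merged matrix. A minimal numpy sketch of that arithmetic (toy shapes and values; this is not the BigDL/peft implementation):

```python
import numpy as np

# Hypothetical tiny shapes for illustration; real LLaMA-2-7B layers are far larger.
d_out, d_in, r = 8, 8, 2
alpha = 4  # LoRA scaling numerator (lora_alpha)

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # LoRA down-projection
B = np.zeros((d_out, r))                 # LoRA up-projection (initialized to 0)
B[0, 0] = 1.0                            # pretend training updated B

# Merging folds the low-rank update into the base weight once:
W_merged = W + (alpha / r) * (B @ A)

# A forward pass through the merged model needs only one matmul:
x = rng.standard_normal(d_in)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))  # base path + adapter path
y_merged = W_merged @ x                          # merged path
assert np.allclose(y_adapter, y_merged)
```

This is why the merged checkpoint can be loaded as an ordinary model with no adapter machinery at inference time.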

rnwang04 avatar Nov 01 '23 01:11 rnwang04

Hi @tsantra, the QLoRA CPU example is now available here: https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/QLoRA-FineTuning
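For intuition only: QLoRA keeps the base weights quantized to 4 bits and dequantizes them on the fly for compute. A simplified per-block uniform-quantization sketch in numpy (QLoRA actually uses an NF4 codebook, which this does not reproduce):

```python
import numpy as np

# One 64-element weight block with a shared scale, a simplified stand-in
# for QLoRA's block-wise 4-bit quantization.
rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)

scale = np.abs(w).max() / 7.0                      # map into signed 4-bit range
q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # 4-bit codes (stored)
w_hat = q.astype(np.float32) * scale               # dequantized for compute

# Rounding error is bounded by half a quantization step.
err = np.abs(w - w_hat).max()
assert err <= scale / 2 + 1e-6
```

The storage win is that `q` needs 4 bits per weight plus one scale per block, while training updates flow only through the small LoRA matrices in higher precision.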

rnwang04 avatar Nov 02 '23 01:11 rnwang04

Hi @rnwang04 , thank you for your reply!

Are you using any metric to check model accuracy after QLoRA fine-tuning? I used my own custom dataset for fine-tuning, and my inference results are not good; the model hallucinates a lot. Do you have any BKM (best known methods) for fine-tuning?

Are you also using a profiler to check GPU memory usage? Do you have any suggestions?

tsantra avatar Nov 03 '23 22:11 tsantra

Had closed this by mistake.

tsantra avatar Nov 03 '23 22:11 tsantra

@rnwang04 GPU fine-tuning suddenly stopped working and gave a segmentation fault.

[screenshot: segfault error]

tsantra avatar Nov 06 '23 00:11 tsantra

> @rnwang04 GPU finetuning suddenly stopped working and gave Seg Fault.

Hi @tsantra, have you ever run GPU fine-tuning successfully, or do you always hit this error? If it worked before, did you change anything in your script or environment settings?

rnwang04 avatar Nov 06 '23 01:11 rnwang04

> Are you using any metric to check for model accuracy after QLora finetuning. I had used my custom dataset for finetuning and my inference results are not good. Model is hallucinating a lot. Do you have any BKM for fine-tuning?

Have you checked your fine-tuning loss curve? Does the loss decrease normally during fine-tuning and ultimately stabilize at a fixed value? What are the approximate final train loss and eval loss?
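Raw step-by-step losses are noisy, so a smoothed curve makes the trend easier to judge. A minimal sketch (assuming the per-step losses were collected into a Python list; the values below are made up):

```python
def ema(values, beta=0.9):
    """Bias-corrected exponential moving average, as common training loggers use."""
    out, avg = [], 0.0
    for t, v in enumerate(values, start=1):
        avg = beta * avg + (1 - beta) * v
        out.append(avg / (1 - beta ** t))  # bias correction for early steps
    return out

# A healthy run trends downward and flattens; noise in the raw log can hide that.
raw = [2.1, 1.9, 2.0, 1.6, 1.7, 1.3, 1.2, 1.25, 1.1, 1.05]
smooth = ema(raw)
assert smooth[-1] < smooth[0]  # overall downward trend
```

If the smoothed curve plateaus at a high loss, or eval loss rises while train loss keeps falling, that points to underfitting or overfitting respectively rather than an inference-side problem.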

> Are you also using any profiler to check for GPU memory usage? Do you have any suggestion?

I just use the "GPU Memory Used" column of `sudo xpu-smi stats -d 0` to check GPU memory usage.

rnwang04 avatar Nov 06 '23 06:11 rnwang04

> @rnwang04 GPU finetuning suddenly stopped working and gave Seg Fault.
>
> [screenshot: segfault error]

Are you running it inside VS Code?

shane-huang avatar Jan 22 '24 12:01 shane-huang