Inference discrepancies after merging weights into a LoRA model
Describe the bug
We noticed discrepancies in the inference results between the model with the LoRA adapter loaded on the fly and the model with the LoRA weights merged in.
Steps/Code to reproduce bug
1. Fine-tune a LoRA model with train_gpt_sft.py in NeMo-Aligner (in our case, we used Mistral-7B as the base model).
2. Run inference with the LoRA adapter loaded on the fly using megatron_gpt_generate.py.
3. Merge the LoRA weights into the base model using merge.py (see the merge sketch below).
4. Run inference on the merged-weight model using the same megatron_gpt_generate.py script (skipping the LoRA ckpt loading).
During this process, we found:
- The inference results from step 2 and step 4 differ substantially.
- The validation inference results printed by the LoRA weight-merge script (step 3) differ from both step 2 and step 4.

(Samples of the model responses are pasted in the Additional context section.)
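For context, the merge in step 3 should be a pure weight fold: standard LoRA merging computes W' = W + (alpha / r) * B A, so a correct merge followed by greedy decoding should reproduce the adapter-loaded outputs up to floating-point noise. Below is a minimal sketch of that math, not NeMo's actual merge.py; the names and the fp32 accumulation are assumptions.

```python
import torch

def merge_lora_weight(w_base: torch.Tensor,
                      lora_a: torch.Tensor,   # (r, in_features)
                      lora_b: torch.Tensor,   # (out_features, r)
                      alpha: float,
                      rank: int) -> torch.Tensor:
    """Fold a LoRA adapter into a base weight: W' = W + (alpha/r) * B @ A."""
    scaling = alpha / rank
    # Accumulate in fp32 to limit rounding error, then cast back to the
    # base weight's dtype (fp16/bf16 merges can drift noticeably).
    delta = (lora_b.float() @ lora_a.float()) * scaling
    return (w_base.float() + delta).to(w_base.dtype)
```

If the merged checkpoint does not satisfy this identity layer by layer, a divergence like the one above would be expected.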
Expected behavior
The inference results from the merged-weight model and the model with the LoRA adapter loaded should be identical given the same inference config:
inference:
  greedy: True
  add_BOS: True
  tokens_to_generate: 1024
  all_probs: False
  repetition_penalty: 1.2
  min_tokens_to_generate: 0
  compute_logprob: False
  end_strings: ["<|endoftext|>"]
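Since greedy: True disables sampling, both runs should be fully deterministic, so any divergence points at the numerics (weights or forward pass) rather than randomness. One way to localize it is to compare next-token logits from the two models on the same prompt; the sketch below assumes HF-style model objects purely for illustration, not the NeMo API:

```python
import torch

@torch.no_grad()
def first_token_divergence(model_adapter, model_merged, input_ids):
    """Compare next-token logits from the adapter-loaded and merged models.

    With greedy decoding, generations diverge at the first position where
    the argmax differs; model_adapter/model_merged are placeholders.
    """
    logits_a = model_adapter(input_ids).logits[0, -1]
    logits_m = model_merged(input_ids).logits[0, -1]
    print("max |logit diff|:", (logits_a - logits_m).abs().max().item())
    print("greedy token matches:", logits_a.argmax().item() == logits_m.argmax().item())
```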
Environment overview
- Environment location: Docker
- Method of NeMo install: install from source
Environment details
- Nvidia PyTorch Docker: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-24-01.html
- NeMo version: https://github.com/NVIDIA/NeMo/commit/9b64e390b534d4eb5ad7f28502bcfe4c7f0c6c39
- NeMo-Aligner version: https://github.com/NVIDIA/NeMo-Aligner/commit/ea78731d9fd86e822b0253fca8a10e0e8a4526c9
Additional context
Samples of the inference results
- Inference result from Step 2 (base Mistral + LoRA adapter loaded on the fly)
# Prompt: Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions. Include photos and thoughtful captions that convey the beauty and aloha spirit of the islands.\n\nAloha friends,\n\nI recently had the pleasure of visiting the beautiful islands of Hawaii for the first time. As a travel blogger, I was eager to experience the rich history and culture of this tropical paradise. My trip was filled with once-in-a-lifetime experiences
...
- Inference result from Step 4 (LoRA model with merged weights)
# Prompt: Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions. Here is a draft with key points to share a fun-filled vacation:\n\nAloha! I recently returned from a fun-filled trip to the big island of Hawaii. I was so blessed to have a family member visit and Hawaiian cultural immersion while there.
...
- Inference result from Step 3 (validation inference inside merge.py)
# Prompt: Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions. Here is a draft:\n\nAloha! I recently returned from a dream vacation in the beautiful islands of Hawaii. I had such a rejuvenating and eye-opening experience there. This laid-back and peaceful paradise really is a special place.\n\nI visited both Big Island and Oahu, and it was such a treat to see the lava flows and waterfalls, and the lush green volcanic crater on Kilauea
...
@mark-myzhao Thanks, we will dig further. In the meantime, can you try training the LoRA weights using https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/language_modeling/tuning/megatron_gpt_finetuning.py?
I met the same issue. I also found that merging the LoRA on a 3090 gives a different result than on an A100; by different I mean the two merged models have different MD5 hashes.
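For what it's worth, a bit-exact MD5 match across GPU generations is a very strict bar: kernel selection and reduction order can legitimately change low-order bits on different hardware even when both merges are numerically fine. A sketch for checking whether the two merges differ only in low-order bits or by something larger; the plain torch.load paths are assumptions (.nemo archives would need unpacking first):

```python
import hashlib
import torch

def tensor_md5(t: torch.Tensor) -> str:
    """MD5 over a tensor's raw bytes (bit-exact fingerprint)."""
    data = t.detach().cpu().contiguous().view(torch.uint8).numpy().tobytes()
    return hashlib.md5(data).hexdigest()

def compare_merges(path_a: str, path_b: str) -> None:
    """Per-tensor comparison of two merged checkpoints (e.g. 3090 vs A100)."""
    sd_a = torch.load(path_a, map_location="cpu")
    sd_b = torch.load(path_b, map_location="cpu")
    for key in sorted(set(sd_a) & set(sd_b)):
        identical = tensor_md5(sd_a[key]) == tensor_md5(sd_b[key])
        max_diff = (sd_a[key].float() - sd_b[key].float()).abs().max().item()
        print(f"{key}: bit-identical={identical}, max abs diff={max_diff:.3e}")
```

If the max absolute differences are tiny (around fp16 epsilon), the MD5 mismatch alone would not explain the large behavioral gap reported above.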
Can I ask a naive question: what temperature are you using to generate the answers? I would like to understand whether it would always generate the same response or whether you are sampling for creativity.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.
Hi @mark-myzhao, I'm facing the same issue. Did you find a solution?