NEFT Tune: Gibberish results
Hello,
I used NEFTune noise alpha in the training loop to avoid overfitting. However, after training for some steps and reaching a loss of 0.8, I asked the model a question and it gave me gibberish results. Here is a screenshot of the result.
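For reference, NEFTune is typically enabled through the trainer rather than inside the loop itself. The original training script isn't shown, so this is only a minimal sketch assuming Unsloth plus TRL's SFTTrainer, with placeholder model name, dataset, and hyperparameters:

```python
# Hypothetical reproduction sketch, not the original training script.
# Model name, dataset, and hyperparameters below are assumptions.
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",  # assumed base model
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,      # assumed to be defined elsewhere
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=2,
        max_steps=60,
        neftune_noise_alpha=5,  # NEFTune: uniform noise added to embeddings during training
    ),
)
trainer.train()
```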
Have you tried reloading the model from the checkpoint? It happened to me as well, even without NEFTune, if I just stopped training and ran inference on the same model without reloading.
It seems to me that it's a tokenizer error, but I agree it might be an issue, since I'm not sure if this affects training as well.
Apologies, I can confirm there are some tokenization issues! Working on a fix.
@AliHaider20 Unsure if my temporary fix resolves this - update via:
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git
Colab / Kaggle do not need to update.
@AliHaider20 I think I just fixed inference (hopefully!)
If it still doesn't work, I'll have to check NEFTune separately
Thanks for looking into the issue. Sorry for the late response.
I trained a completely new Mistral model, and this time it doesn't give long gibberish, but the answers still don't make any sense. Sometimes there is no answer at all (no generation); just the end-of-sequence token is printed.
Is this issue resolved?
Oh tbh I'm not certain - I haven't tried it, but I'm assuming it's resolved?
Hi @danielhanchen, I'm not sure this is resolved
I was also getting gibberish results when finetuning Llama 3.1 8B Instruct on long-context samples using unsloth with neftune_noise_alpha=5, and it looks like the issue was resolved as soon as I stopped using NEFTune.
A wild guess: since the loss seemed okay-ish and on par with training without NEFTune and/or using transformers without unsloth, my guess is that NEFTune stays enabled even when doing inference with FastLanguageModel.for_inference(model).
I didn't try reloading the model and using it without NEFTune after having trained it with NEFTune to see if that is the problem.
Update on this. It looks like trainer hooks (including NEFTune) affect the model even though they're part of the trainer, so they should be disabled/removed from the trainer when doing inference. I think this is what unsloth is missing.
I finetuned my model with NEFTune and it was outputting gibberish, but just calling:
trainer.neftune_hook_handle.remove()
fixed it. No reloading of the model was needed.
So, I guess we'd want this to be done when calling FastLanguageModel.for_inference(model). What I'm not sure about is how to re-register the hook so that FastLanguageModel.for_training(model) can keep NEFTune enabled.
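For illustration, here is a minimal sketch of that remove / re-register dance, assuming the NEFTune hook helper that ships with transformers (transformers.trainer_utils.neftune_post_forward_hook); this is not Unsloth's internal implementation, and the prompt and alpha value are placeholders:

```python
# Sketch: manually toggling the NEFTune forward hook around inference.
# Assumes `model`, `tokenizer`, and `trainer` already exist and NEFTune was
# enabled via neftune_noise_alpha.
from transformers.trainer_utils import neftune_post_forward_hook

# Before generating: remove the hook so no noise is added to the embeddings.
trainer.neftune_hook_handle.remove()
inputs = tokenizer("Some question", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Before resuming training: re-register the same hook on the input embeddings.
embeddings = model.get_input_embeddings()
embeddings.neftune_noise_alpha = 5  # the alpha used for training
trainer.neftune_hook_handle = embeddings.register_forward_hook(neftune_post_forward_hook)
```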
Actually I JUST noticed NEFTune was NEVER enabled during training, and during inference, it gets enabled, hence the gibberish.
As @ivsanro1 suggested, I tried doing it that way, but I found HF's handling doesn't interact well with Unsloth. So I manually add a forward hook, disable it during FastLanguageModel.for_inference, and re-enable it during FastLanguageModel.for_training.
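With that fix in place, the usual Unsloth toggle should be all that's needed; a minimal usage sketch (prompt and generation settings are placeholders):

```python
from unsloth import FastLanguageModel

# Inference mode: per the fix above, this also detaches the NEFTune noise hook.
FastLanguageModel.for_inference(model)
inputs = tokenizer("Question: ...", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Back to training mode: the NEFTune hook gets re-attached.
FastLanguageModel.for_training(model)
trainer.train(resume_from_checkpoint=True)
```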
Please update Unsloth via:
pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
Thanks @danielhanchen. Is it fixed now? Also, I was using it on Kaggle; I hope the update will be available on Kaggle as well.
@AliHaider20 Kaggle is also fixed!