
Finetuning on Llama 3.1 instruct version throws Untrained tokens found error

Open · paraschopra opened this issue 5 months ago · 1 comment

Replicating my issue from Discord here.

I'm following the provided notebook on my dataset, but it keeps throwing the following error:

```
Unsloth: Untrained tokens of [[128042, 128036]] found, but embed_tokens & lm_head not trainable, causing NaNs. Restart then add embed_tokens & lm_head to FastLanguageModel.get_peft_model(target_modules = [..., "embed_tokens", "lm_head",]). Are you using the base model? Instead, use the instruct version to silence this warning.
```
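The fix the error message suggests can be sketched as follows. This is a minimal sketch, not a verified solution: it assumes `model` is already loaded via unsloth's `FastLanguageModel`, and the LoRA hyperparameters (`r`, `lora_alpha`) and the projection-module names are typical values from Unsloth's Llama notebooks, not taken from this issue.

```python
# Hypothetical sketch: make the embedding matrices trainable so the
# "untrained tokens" don't produce NaNs. The key change, per the error
# message, is adding "embed_tokens" and "lm_head" to target_modules.
target_modules = [
    "q_proj", "k_proj", "v_proj", "o_proj",      # attention projections
    "gate_proj", "up_proj", "down_proj",         # MLP projections
    "embed_tokens", "lm_head",                   # added per the error message
]

# Assumes `model` was loaded with FastLanguageModel.from_pretrained(...):
# from unsloth import FastLanguageModel
# model = FastLanguageModel.get_peft_model(
#     model,
#     r=16,                 # example LoRA rank, not from the issue
#     lora_alpha=16,        # example value, not from the issue
#     target_modules=target_modules,
# )
```

Note that training `embed_tokens` and `lm_head` increases memory use, since full embedding matrices become trainable rather than low-rank adapters alone.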

I generated the data via Llama only, so I'm not sure why this is happening!

See my notebook:

https://colab.research.google.com/drive/1YhKQk4lAhlO0rGwjlQQ6fiv680nimXXE?usp=sharing

Thanks for the help!

paraschopra · Sep 25 '24 12:09