unsloth

qlora training on qwen1.5-14b-chat

Open wsp317 opened this issue 1 year ago • 3 comments

While training qwen1.5-14b-chat, I ran into the error below, with transformers==4.38.2

RuntimeError( "Unsloth: Tokenizer's pad_token cannot be = eos_token, and we couldn't find a\n"
"replacement of either <|reserved... or <|placeholder..." )

wsp317 • May 13 '24 09:05

Oh, that is an issue: the pad_token must not be the same as the eos_token, otherwise the finetune will be incorrect. I'll see if I can extend the tokenizer itself.

danielhanchen • May 13 '24 10:05
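
One way to see why a shared pad_token and eos_token can make a finetune incorrect is the label-masking step many training pipelines apply: pad positions are excluded from the loss, so if padding and EOS share an id, the real EOS is excluded too and the model gets no signal for when to stop. The sketch below is illustrative only and is not Unsloth's code; the token ids and the helper name `mask_padding` are made up for the example.

```python
# Illustrative only: a typical label-masking step, not Unsloth's actual code.
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default

def mask_padding(input_ids, pad_token_id):
    """Return labels where every pad position is excluded from the loss."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]

# Hypothetical token ids: 1, 2, 3 are text, 9 is EOS, 0 is a dedicated pad token.
EOS_ID = 9
sequence = [1, 2, 3, EOS_ID]

# Case 1: pad differs from EOS, so the EOS label survives and can be learned.
padded_ok = sequence + [0, 0]
labels_ok = mask_padding(padded_ok, pad_token_id=0)

# Case 2: pad equals EOS, so the real EOS is masked together with the padding.
padded_bad = sequence + [EOS_ID, EOS_ID]
labels_bad = mask_padding(padded_bad, pad_token_id=EOS_ID)

print(labels_ok)   # [1, 2, 3, 9, -100, -100]
print(labels_bad)  # [1, 2, 3, -100, -100, -100]
```

In the second case no position in the labels teaches the model to emit EOS, which is the kind of silent failure the RuntimeError above guards against.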

I changed the pad_token from <|endoftext|> to <|im_end|> in Qwen's tokenizer_config.json file, and the training seems to work.

wsp317 • May 13 '24 10:05
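
For anyone who wants to script the manual edit described above, a minimal sketch using only the Python standard library might look like this. It assumes pad_token is stored as a plain string in tokenizer_config.json (some configs store it as a dictionary instead); the helper name `set_pad_token` and the path in the usage comment are hypothetical.

```python
import json
from pathlib import Path

def set_pad_token(config_path, new_pad_token):
    """Rewrite the pad_token entry of a tokenizer_config.json; return the old value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    old_pad_token = config.get("pad_token")  # None if the key was absent
    config["pad_token"] = new_pad_token
    # ensure_ascii=False keeps any non-ASCII tokens readable in the file.
    path.write_text(json.dumps(config, ensure_ascii=False, indent=2), encoding="utf-8")
    return old_pad_token

# Hypothetical usage; replace the path with your local model directory:
# set_pad_token("path/to/qwen1.5-14b-chat/tokenizer_config.json", "<|im_end|>")
```

Before relying on this, check that the new pad_token still differs from the eos_token the tokenizer actually loads, since that is the condition the error message checks.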

@wsp317 I just fixed it! Sorry for the delay!

If you're on a local machine, please update Unsloth via:

pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git

Colab and Kaggle are fine (just restart the runtime).

danielhanchen • May 16 '24 04:05