Tokenizer issue with Vicuna v1.1: EOS and BOS tokens seem to be blank
Hello,
When I try to get the BOS and EOS tokens from the tokenizer, I get '' for both. I tried it with both AutoTokenizer and LlamaTokenizer.
>>> tokenizer.eos_token
''
>>> tokenizer.bos_token
''
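For completeness, here is a minimal check (the ./vicuna-7b path is a placeholder for my converted weight folder); printing the token IDs as well shows whether the underlying vocabulary entries are intact:
from transformers import AutoTokenizer

# "./vicuna-7b" is a placeholder for the converted weight folder.
tokenizer = AutoTokenizer.from_pretrained("./vicuna-7b", use_fast=False)
print(repr(tokenizer.bos_token), tokenizer.bos_token_id)  # a healthy LLaMA tokenizer gives '<s>' 1
print(repr(tokenizer.eos_token), tokenizer.eos_token_id)  # a healthy LLaMA tokenizer gives '</s>' 2
print(tokenizer.special_tokens_map)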
The documentation on Hugging Face says that the EOS token is "</s>", but I suspect the tokenizer is not actually using it, since this is the special_tokens_map.json file:
{
  "bos_token": {
    "content": "",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
Could anyone tell me if they're experiencing the same issue, and whether it might be an error?
I get the following from within the code (while debugging):
tokenizer.eos_token
'</s>'
tokenizer.bos_token
'<s>'
But on my system, once I ask a question, the ASSISTANT goes on forever, holding the conversation on its own. So I believe there is something odd with those tokenizers.
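A minimal sketch of why that happens, assuming an already-loaded model and tokenizer pair: generate() only stops early when the EOS ID is emitted, so a tokenizer whose eos_token resolves to '' never triggers the stop.
# model and tokenizer are assumed to be an already-loaded Vicuna pair.
inputs = tokenizer("USER: Hello! ASSISTANT:", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    # Generation stops early only when this ID is produced; with a blank
    # eos_token it can resolve to unk/None, so the model keeps writing
    # both sides of the conversation until max_new_tokens is exhausted.
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))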
Could you try the following steps?
- Update the Hugging Face transformers library to the latest main branch
- Redo the weight conversion following https://huggingface.co/docs/transformers/main/model_doc/llama
- Apply the delta with the latest FastChat (example commands below)
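For reference, the two commands look roughly like this (all paths are placeholders; adjust the model size and directories to your setup):
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/llama --model_size 7B --output_dir /data/models/llama-7b-hf
python3 -m fastchat.model.apply_delta \
    --base /data/models/llama-7b-hf --target /data/models/vicuna-7b --delta /data/models/vicuna-7b-delta-v1.1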
Hugging Face made some changes to the LLaMA tokenizer recently.
OK, I upgraded:
- fastchat to the latest version
- Hugging Face transformers to the latest version
- and applied the delta with the latest fastchat
And I still get the same problem, i.e. the assistant does the whole conversation between assistant and user on its own :(
What weight version did you use, v0 or v1.1? Please check their differences and fschat version compatibility here: https://github.com/lm-sys/FastChat/blob/main/docs/weights_version.md
Could you share your chat history so we can know what happened?
I have the same issue with Vicuna V1.1
I have the same issue with fastchat 0.2.1. I have tried updating Hugging Face transformers and restarting the workers, but it still does not work. Vicuna v0 and vicuna v1.1 both have the same issue. Only when I change the fschat version to 0.1.10 is the problem solved. @merrymercy
I'm fine with vicuna v1.0 and fastchat 0.2.1, but my model was converted on 0.1.9. And I have the same problem with v1.1, which was converted under 0.2.1.
New models and v0.1.10 work for me
So:
- fschat v0.1.10 + vicuna-7b-v0: works
- fschat v0.1.10 + vicuna-7b-v1.1: works
- fschat v0.2.1 + vicuna-7b-v0: does not work
- fschat v0.2.1 + vicuna-7b-v1.1: does not work
I guess the blank EOS/BOS is not only related to FastChat or the Vicuna weights; it is also related to how you converted the base LLaMA model.
I suggest you use transformers>=4.28.0 and redo the weight conversion. For either v0 or v1.1, you should get a file named "special_tokens_map.json" in your converted weights, with the same content as this file: https://huggingface.co/lmsys/vicuna-13b-delta-v0/blob/main/special_tokens_map.json. If not, please copy special_tokens_map.json and tokenizer_config.json from https://huggingface.co/lmsys/vicuna-13b-delta-v0/tree/main to your converted weight folder (this works for both v0 and v1.1).
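If you prefer to script that copy, a small sketch using huggingface_hub (the target path is a placeholder for your converted weight folder):
import shutil
from huggingface_hub import hf_hub_download

target = "/data/models/vicuna-7b"  # placeholder: your converted weight folder
for name in ("special_tokens_map.json", "tokenizer_config.json"):
    # Fetch the known-good file from the delta repo and copy it over.
    path = hf_hub_download(repo_id="lmsys/vicuna-13b-delta-v0", filename=name)
    shutil.copy(path, target)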
In terms of compatibility:
The v1.1 weights work best with fschat>=0.2.1, but also work with older fschat.
The v0 weights work best with fschat==0.1.10 and do not work with newer fschat, unless you explicitly specify the conversation template via this line:
https://github.com/lm-sys/FastChat/blob/898d4fcf94feff9aa5bf792f0e135b6fecb7cf38/fastchat/serve/inference.py#L30
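As a sketch (the "v1" template key is illustrative, not guaranteed for every fschat release; check fastchat/conversation.py in your installed version for the exact name):
from fastchat.conversation import conv_templates

# "v1" is an illustrative key; look up the template matching your weights.
conv = conv_templates["v1"].copy()
conv.append_message(conv.roles[0], "Hello!")
conv.append_message(conv.roles[1], None)
print(conv.get_prompt())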
Redownloading the models and redoing the conversion will fix this.
Thanks. My environment: Python 3.9, transformers 4.28.1, fschat 0.2.2.
After applying the delta with the latest fastchat, I still get the blank EOS/BOS in special_tokens_map.json:
python3 -m fastchat.model.apply_delta --base /data/models/llama-7b-hf --target /data/models/vicuna-7b --delta /data/models/vicuna-7b-delta-v1.1
The problem is solved after copying special_tokens_map.json and tokenizer_config.json.
Package Version
accelerate 0.18.0
aiofiles 23.1.0
aiohttp 3.8.4
aiosignal 1.3.1
altair 4.2.2
anyio 3.6.2
appdirs 1.4.4
async-timeout 4.0.2
attrs 22.2.0
certifi 2022.12.7
charset-normalizer 3.1.0
click 8.1.3
cmake 3.26.3
contourpy 1.0.7
cycler 0.11.0
docker-pycreds 0.4.0
entrypoints 0.4
fastapi 0.95.0
ffmpy 0.3.0
filelock 3.11.0
fonttools 4.39.3
frozenlist 1.3.3
fschat 0.2.2
fsspec 2023.4.0
gitdb 4.0.10
GitPython 3.1.31
gradio 3.23.0
h11 0.14.0
httpcore 0.17.0
httpx 0.24.0
huggingface-hub 0.13.4
idna 3.4
importlib-resources 5.12.0
Jinja2 3.1.2
jsonschema 4.17.3
kiwisolver 1.4.4
linkify-it-py 2.0.0
lit 16.0.1
markdown-it-py 2.2.0
markdown2 2.4.8
MarkupSafe 2.1.2
matplotlib 3.7.1
mdit-py-plugins 0.3.3
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.4
networkx 3.1
numpy 1.24.2
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
orjson 3.8.10
packaging 23.0
pandas 2.0.0
pathtools 0.1.2
Pillow 9.5.0
pip 23.0.1
prompt-toolkit 3.0.38
protobuf 4.22.1
psutil 5.9.4
pydantic 1.10.7
pydub 0.25.1
Pygments 2.15.0
pyparsing 3.0.9
pyrsistent 0.19.3
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.3
PyYAML 6.0
regex 2023.3.23
requests 2.28.2
rich 13.3.3
semantic-version 2.10.0
sentencepiece 0.1.97
sentry-sdk 1.19.1
setproctitle 1.3.2
setuptools 65.6.3
shortuuid 1.0.11
six 1.16.0
smmap 5.0.0
sniffio 1.3.0
starlette 0.26.1
svgwrite 1.4.3
sympy 1.11.1
tokenizers 0.13.3
toolz 0.12.0
torch 2.0.0
tqdm 4.65.0
transformers 4.28.1
triton 2.0.0
typing_extensions 4.5.0
tzdata 2023.3
uc-micro-py 1.0.1
urllib3 1.26.15
uvicorn 0.21.1
wandb 0.14.2
wavedrom 2.0.3.post3
wcwidth 0.2.6
websockets 11.0.1
wheel 0.38.4
yarl 1.8.2
zipp 3.15.0
@merrymercy
Thanks everyone. Converting the LLaMA weights using the new converter from Hugging Face and then applying the Vicuna v1.1 delta worked out of the box.