Tokenizer issue with Vicuna v1.1: EOS and BOS tokens seem to be blank
Hello,
When I try to get the BOS and EOS tokens from the tokenizer, I get '' for both. I tried it with both AutoTokenizer and LlamaTokenizer.
>>> tokenizer.eos_token
''
>>> tokenizer.bos_token
''
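For completeness, here is a minimal check (the ./vicuna-7b path is a placeholder for my converted weight folder); printing the token IDs as well shows whether the underlying vocabulary entries are intact:
from transformers import AutoTokenizer

# "./vicuna-7b" is a placeholder for the converted weight folder.
tokenizer = AutoTokenizer.from_pretrained("./vicuna-7b", use_fast=False)
print(repr(tokenizer.bos_token), tokenizer.bos_token_id)  # a healthy LLaMA tokenizer gives '<s>' 1
print(repr(tokenizer.eos_token), tokenizer.eos_token_id)  # a healthy LLaMA tokenizer gives '</s>' 2
print(tokenizer.special_tokens_map)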
The documentation on Hugging Face says that the EOS token is "</s>", but I suspect the tokenizer is not actually using it, since this is the special_tokens_map.json file:
{
  "bos_token": {
    "content": "",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
Could anyone tell me if they're experiencing the same issue, and whether it might be an error?
I get the following from within the code (while debugging):
tokenizer.eos_token
'</s>'
tokenizer.bos_token
'<s>'
But on my system, once I ask a question, the ASSISTANT goes on forever, holding the conversation on its own. So I believe there is something odd with those tokenizers.
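A minimal sketch of why that happens, assuming an already-loaded model and tokenizer pair: generate() only stops early when the EOS ID is emitted, so a tokenizer whose eos_token resolves to '' never triggers the stop.
# model and tokenizer are assumed to be an already-loaded Vicuna pair.
inputs = tokenizer("USER: Hello! ASSISTANT:", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    # Generation stops early only when this ID is produced; with a blank
    # eos_token it can resolve to unk/None, so the model keeps writing
    # both sides of the conversation until max_new_tokens is exhausted.
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))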
Could you try the following steps?
- Update the Hugging Face transformers library to the latest main branch
- Redo the weight conversion following https://huggingface.co/docs/transformers/main/model_doc/llama
- Apply the delta with the latest FastChat (example commands below)
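For reference, the two commands look roughly like this (all paths are placeholders; adjust the model size and directories to your setup):
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/llama --model_size 7B --output_dir /data/models/llama-7b-hf
python3 -m fastchat.model.apply_delta \
    --base /data/models/llama-7b-hf --target /data/models/vicuna-7b --delta /data/models/vicuna-7b-delta-v1.1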
Hugging Face made some changes to the LLaMA tokenizer recently.
OK, I upgraded:
- fastchat to the latest version
- Hugging Face transformers to the latest version
- and applied the delta with the latest fastchat
And I still get the same problem, i.e. the assistant does the whole conversation between assistant and user on its own :(
What weight version did you use, v0 or v1.1? Please check their differences and fschat version compatibility here: https://github.com/lm-sys/FastChat/blob/main/docs/weights_version.md
Could you share your chat history so we can know what happened?
I have the same issue with Vicuna V1.1
I have the same issue with fastchat 0.2.1. I have tried updating Hugging Face transformers and restarting the workers, but it still does not work. Vicuna v0 and vicuna v1.1 both have the same issue. Only when I change the fschat version to 0.1.10 is the problem solved. @merrymercy
I'm fine with vicuna v1.0 and fastchat 0.2.1, but my model was converted on 0.1.9. And I have the same problem with v1.1, which was converted under 0.2.1.
New models and v0.1.10 work for me
So:
- fschat v0.1.10 + vicuna-7b-v0: works
- fschat v0.1.10 + vicuna-7b-v1.1: works
- fschat v0.2.1 + vicuna-7b-v0: does not work
- fschat v0.2.1 + vicuna-7b-v1.1: does not work
I guess the blank EOS/BOS is not only related to FastChat or the Vicuna weights; it is also related to how you converted the base LLaMA model.
I suggest you use transformers>=4.28.0 and redo the weight conversion. For either v0 or v1.1, you should get a file named "special_tokens_map.json" in your converted weights, with the same content as this file: https://huggingface.co/lmsys/vicuna-13b-delta-v0/blob/main/special_tokens_map.json. If not, please copy special_tokens_map.json and tokenizer_config.json from https://huggingface.co/lmsys/vicuna-13b-delta-v0/tree/main to your converted weight folder (this works for both v0 and v1.1).
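If you prefer to script that copy, a small sketch using huggingface_hub (the target path is a placeholder for your converted weight folder):
import shutil
from huggingface_hub import hf_hub_download

target = "/data/models/vicuna-7b"  # placeholder: your converted weight folder
for name in ("special_tokens_map.json", "tokenizer_config.json"):
    # Fetch the known-good file from the delta repo and copy it over.
    path = hf_hub_download(repo_id="lmsys/vicuna-13b-delta-v0", filename=name)
    shutil.copy(path, target)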
In terms of compatibility:
The v1.1 weights work best with fschat>=0.2.1, but also work with older fschat.
The v0 weights work best with fschat==0.1.10 and do not work with newer fschat, unless you explicitly specify the conversation template via this line:
https://github.com/lm-sys/FastChat/blob/898d4fcf94feff9aa5bf792f0e135b6fecb7cf38/fastchat/serve/inference.py#L30
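As a sketch (the "v1" template key is illustrative, not guaranteed for every fschat release; check fastchat/conversation.py in your installed version for the exact name):
from fastchat.conversation import conv_templates

# "v1" is an illustrative key; look up the template matching your weights.
conv = conv_templates["v1"].copy()
conv.append_message(conv.roles[0], "Hello!")
conv.append_message(conv.roles[1], None)
print(conv.get_prompt())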
Redownloading the models and redoing the conversion will fix this.
Thanks. My environment: Python 3.9, transformers 4.28.1, fschat 0.2.2.
After applying the delta with the latest fastchat, I still get the blank EOS/BOS in special_tokens_map.json:
python3 -m fastchat.model.apply_delta --base /data/models/llama-7b-hf --target /data/models/vicuna-7b --delta /data/models/vicuna-7b-delta-v1.1
The problem is solved after copying special_tokens_map.json and tokenizer_config.json.
Package Version
accelerate 0.18.0
aiofiles 23.1.0
aiohttp 3.8.4
aiosignal 1.3.1
altair 4.2.2
anyio 3.6.2
appdirs 1.4.4
async-timeout 4.0.2
attrs 22.2.0
certifi 2022.12.7
charset-normalizer 3.1.0
click 8.1.3
cmake 3.26.3
contourpy 1.0.7
cycler 0.11.0
docker-pycreds 0.4.0
entrypoints 0.4
fastapi 0.95.0
ffmpy 0.3.0
filelock 3.11.0
fonttools 4.39.3
frozenlist 1.3.3
fschat 0.2.2
fsspec 2023.4.0
gitdb 4.0.10
GitPython 3.1.31
gradio 3.23.0
h11 0.14.0
httpcore 0.17.0
httpx 0.24.0
huggingface-hub 0.13.4
idna 3.4
importlib-resources 5.12.0
Jinja2 3.1.2
jsonschema 4.17.3
kiwisolver 1.4.4
linkify-it-py 2.0.0
lit 16.0.1
markdown-it-py 2.2.0
markdown2 2.4.8
MarkupSafe 2.1.2
matplotlib 3.7.1
mdit-py-plugins 0.3.3
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.4
networkx 3.1
numpy 1.24.2
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
orjson 3.8.10
packaging 23.0
pandas 2.0.0
pathtools 0.1.2
Pillow 9.5.0
pip 23.0.1
prompt-toolkit 3.0.38
protobuf 4.22.1
psutil 5.9.4
pydantic 1.10.7
pydub 0.25.1
Pygments 2.15.0
pyparsing 3.0.9
pyrsistent 0.19.3
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.3
PyYAML 6.0
regex 2023.3.23
requests 2.28.2
rich 13.3.3
semantic-version 2.10.0
sentencepiece 0.1.97
sentry-sdk 1.19.1
setproctitle 1.3.2
setuptools 65.6.3
shortuuid 1.0.11
six 1.16.0
smmap 5.0.0
sniffio 1.3.0
starlette 0.26.1
svgwrite 1.4.3
sympy 1.11.1
tokenizers 0.13.3
toolz 0.12.0
torch 2.0.0
tqdm 4.65.0
transformers 4.28.1
triton 2.0.0
typing_extensions 4.5.0
tzdata 2023.3
uc-micro-py 1.0.1
urllib3 1.26.15
uvicorn 0.21.1
wandb 0.14.2
wavedrom 2.0.3.post3
wcwidth 0.2.6
websockets 11.0.1
wheel 0.38.4
yarl 1.8.2
zipp 3.15.0
@merrymercy
Thanks everyone. Converting the LLaMA weights using the new converter from Hugging Face and then applying the Vicuna v1.1 delta worked out of the box.