ChatGLM-6B

[BUG/Help] tokenizer collapse?

Open CoinCheung opened this issue 2 years ago • 0 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

Tokenizer collision: the two different token IDs in the snippet below decode to the same string.

Expected Behavior

No response

Steps To Reproduce

```python
from transformers import AutoConfig, AutoTokenizer

model_name = 'THUDM/chatglm3-6b-base'
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True, use_fast=False)
tokenizer = AutoTokenizer.from_pretrained(model_name, config=config, trust_remote_code=True)
print([tokenizer.decode(el) for el in [1833, 2893]])
```
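For context, a quick diagnostic (not part of the original report, and assuming the tokenizer loads as above) could scan a range of token IDs and group the ones that decode to identical strings, to see whether 1833 and 2893 are an isolated pair or part of a wider pattern:

```python
from collections import defaultdict

from transformers import AutoTokenizer

model_name = 'THUDM/chatglm3-6b-base'
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Group token IDs by their decoded string; a small range is used for illustration only.
decoded_to_ids = defaultdict(list)
for token_id in range(3000):
    decoded_to_ids[tokenizer.decode(token_id)].append(token_id)

# Print every decoded string that more than one token ID maps to.
for text, ids in decoded_to_ids.items():
    if len(ids) > 1:
        print(repr(text), ids)
```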

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
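The fields above were left blank in the original report. A hypothetical helper to collect them, assuming the relevant packages are installed in the environment that reproduces the issue:

```python
import platform
import sys

import torch
import transformers

# Print the environment details requested by the issue template.
print("OS:", platform.platform())
print("Python:", sys.version.split()[0])
print("Transformers:", transformers.__version__)
print("PyTorch:", torch.__version__)
print("CUDA Support:", torch.cuda.is_available())
```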

Anything else?

On my platform, both token IDs decode to the same output (screenshot attached).

CoinCheung · Jan 12 '24 04:01