Exception: expected value at line 1 column 1
System Info
- transformers version: 4.28.0.dev0
- Platform: Linux-5.4.0-144-generic-x86_64-with-glibc2.31
- Python version: 3.9.16
- Huggingface_hub version: 0.13.2
- PyTorch version (GPU?): 2.0.0+cu117 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
@ArthurZucker @sgugger @gante
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
```
File "/mnt1/wcp/BEELE/BELLE-main/generate_instruction.py", line 28, in
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
  File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 679, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1804, in from_pretrained
    return cls._from_pretrained(
  File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/models/bloom/tokenization_bloom_fast.py", line 118, in __init__
    super().__init__(
  File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 111, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: expected value at line 1 column 1
```
Expected behavior
I hope the script runs successfully.
Hey @wccccp 👋
That exception is not due to transformers, but rather due to a .json file (or similar). There is probably something fishy with your tokenizer checkpoint.
See this Stack Overflow issue.
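`expected value at line 1 column 1` is what a JSON parser reports when the file does not start with valid JSON at all — typically because the file is empty, an HTML error page, or a git-lfs pointer left behind by an incomplete download. A minimal sketch for checking a local tokenizer file (the path below is a placeholder; point it at the `tokenizer.json` in your own checkpoint directory):

```python
import json
from pathlib import Path

def check_tokenizer_json(path):
    """Return (ok, message) describing whether `path` holds parseable JSON."""
    raw = Path(path).read_bytes()
    if not raw.strip():
        return False, "file is empty"
    if raw.startswith(b"version https://git-lfs"):
        return False, "file is a git-lfs pointer, not the real tokenizer.json"
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc}"
    return True, "looks like valid JSON"

# Example (placeholder path):
# ok, msg = check_tokenizer_json("/path/to/checkpoint/tokenizer.json")
# print(ok, msg)
```

If the check fails, re-downloading the checkpoint (or running `git lfs pull` in the repo) usually fixes it.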
You are right, the issue is solved.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
This is exactly what is happening for me:
I'm working on a personal project, and this error occurs while using the official tokenizer for the RWKV model through LangChain, which uses the rwkv pip package and its tokenizer module.
File "/content/Intellique/main.py", line 442, in
Does anyone have a solution for this, @wccccp?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@TheFaheem make sure that you downloaded the actual tokenizer.json and tokenizer.model files. Cross-check the size of both files between the Hugging Face Hub and your local copies.