Arthur
Let's close it 😉
Hey all! Sorry, I'll have a look!
I am pretty sure that `32106: AddedToken(" ", rstrip=False, lstrip=False, single_word=False, normalized=True, special=False)` is the issue:

```python
tokenizer.encode("hey .")
```

will reproduce it.
If I do `AutoTokenizer.from_pretrained("path-to-model", added_tokens_decoder=None)` then this is no longer the case
Re-opening as the merge on main will be reverted for a better fix soon
Hey! It does not seem to be asked for that much unfortunately, and it would be a lot of effort on our side. You do have unofficial C bindings out there, I...
This was fixed in `transformers`; you need to set `legacy=False` 🤗
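For context, a minimal toy sketch of what the legacy behavior looks like. This is an illustration of the symptom, not `transformers`' actual code: with the legacy (buggy) path, sentencepiece-style tokenizers treat text that follows a special token as a fresh sentence and give it an extra `▁` word-start prefix.

```python
# Toy sketch of the `legacy` flag's effect -- an illustration, not the
# actual `transformers` implementation. `SPECIAL` and the helpers here
# are hypothetical names for the demo.

SPECIAL = "<s>"

def tokenize_segment(text, add_prefix):
    # sentencepiece marks word starts with "▁"; the legacy path also
    # prepends one to a segment that directly follows a special token
    prefix = "▁" if add_prefix else ""
    return prefix + text.replace(" ", "▁")

def encode(text, legacy):
    tokens = []
    for i, seg in enumerate(text.split(SPECIAL)):
        if i > 0:
            tokens.append(SPECIAL)
        if seg:
            tokens.append(tokenize_segment(seg, add_prefix=legacy and i > 0))
    return tokens

print(encode("Hello<s>world", legacy=True))   # -> ['Hello', '<s>', '▁world']
print(encode("Hello<s>world", legacy=False))  # -> ['Hello', '<s>', 'world']
```

With `legacy=False` the spurious `▁` after the special token goes away, which matches the fixed behavior.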
Thanks for your contribution 🤗
This issue is more a feature request than a `problem`. You are doing something wrong, as the error indicates: pretty sure the special tokens are missing from the `tokenizer`, while...
Hey! Pretty sure it is available in `peft` see this [notebook](https://github.com/huggingface/peft/blob/main/examples/int8_training/peft_bnb_whisper_large_v2_training.ipynb) and this [discussion](https://github.com/openai/whisper/discussions/988)