Huey Sun

4 comments by Huey Sun

> PS: you can already replace directly in the `vocab` and the `added_vocab` (since these tokens are part of both)

Hi @ArthurZucker, do you mind elaborating on this? I'm experiencing...

Sure @itazap, thank you. I am using the Mistral-7B-v0.3 tokenizer.

```python
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained('mistralai/Mistral-7B-v0.3')
>>> tokenizer.vocab['[control_8]']
10
>>> del tokenizer.vocab['[control_8]']
>>> tokenizer.vocab['[control_8]']
10
```
...
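For anyone else puzzled by the behavior above, here is a minimal sketch (not the actual `transformers` internals, just an illustration of the copy semantics) of why the `del` appears to have no effect: for fast tokenizers, accessing `.vocab` builds a fresh dict on every call, so mutating the returned dict never touches the underlying tokenizer state.

```python
class TokenizerSketch:
    """Hypothetical stand-in for a fast tokenizer's vocab accessor."""

    def __init__(self):
        # internal vocabulary, as the real tokenizer would hold it
        self._vocab = {'[control_8]': 10}

    @property
    def vocab(self):
        # returns a fresh copy on each access, so callers can't
        # mutate the internal vocabulary through it
        return dict(self._vocab)


t = TokenizerSketch()
del t.vocab['[control_8]']        # deletes from a throwaway copy
print(t.vocab['[control_8]'])     # the internal entry is untouched
```

This is why deleting from `tokenizer.vocab` silently does nothing: each access hands back a new dict, and the deletion is applied to that temporary copy.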

I'm getting this error when I try to instantiate the `PreTrainedTokenizerFast` object:

```python
>>> tokenizer = PreTrainedTokenizerFast.from_pretrained('mistralai/Mistral-7B-v0.3')
The tokenizer class you load from this checkpoint is not the same type...
```
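That warning is driven by the `tokenizer_class` field in the checkpoint's `tokenizer_config.json`: if it names a different class than the one used to load, the mismatch message is printed. A sketch of the relevant fragment, with the value assumed for this workaround (a local copy of the config edited to match the loading class):

```json
{
  "tokenizer_class": "PreTrainedTokenizerFast"
}
```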

Thank you for the clarification @itazap, modifying the configs worked perfectly! After calling `save_pretrained` on my `PreTrainedTokenizerFast` tokenizer, I was able to load it locally (with the properly overwritten tokens)...