Fix convert_tekken_tokenizer
What does this PR do?
Right now `convert_tekken_tokenizer` does not add `bos_token` and `eos_token` to the special tokens via the `add_special_tokens` method.
This prevents chat templates that expect `eos_token` and `bos_token` from working properly.
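For illustration, a minimal sketch of the failure mode (the local path is hypothetical, and the exact output depends on the chat template):

```python
from transformers import AutoTokenizer

# Hypothetical path to a tokenizer produced by convert_tekken_tokenizer
# before this fix.
tokenizer = AutoTokenizer.from_pretrained("./converted_tekken_tokenizer")

# bos_token/eos_token were never registered as special tokens, so they
# resolve to None:
print(tokenizer.bos_token, tokenizer.eos_token)  # None None

# A chat template that interpolates eos_token then renders "None" (or
# raises) instead of emitting the real end-of-sequence token:
messages = [{"role": "user", "content": "Hello!"}]
print(tokenizer.apply_chat_template(messages, tokenize=False))
```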
Previously this worked because saving the tokenizer created a `special_tokens_map.json`, which is no longer the case. I'm not sure why, but I'd assume this is due to the V5 refactoring?
This PR fixes that by explicitly adding these tokens to the tokenizer; when saving, they are now stored in `tokenizer_config.json`.
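In spirit, the change amounts to the following sketch (not the exact diff; the path and token strings shown are illustrative, the real ones come from the Tekken vocab):

```python
from transformers import AutoTokenizer

# Hypothetical path, same converted tokenizer as above.
tokenizer = AutoTokenizer.from_pretrained("./converted_tekken_tokenizer")

# Register the special tokens explicitly so the tokenizer knows about them.
tokenizer.add_special_tokens({"bos_token": "<s>", "eos_token": "</s>"})

# On save, they now land in tokenizer_config.json (rather than the old
# special_tokens_map.json), so chat templates can resolve them again.
tokenizer.save_pretrained("./converted_tekken_tokenizer")
```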
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
@ArthurZucker
cc @itazap
run-slow: ministral3, mistral3
This comment contains run-slow, running the specified jobs:
models: ["models/ministral3", "models/mistral3"]
quantizations: []
Hey! Thanks for the PR. Can you please share a short reproducer of the problem you mentioned with chat templates? Perhaps we'll need to add a test!
[For maintainers] Suggested jobs to run (before merge)
run-slow: ministral3, mistral3
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.