transformers
SPLIT PR: eos bos tokens
Fix for 2 issues:
1. `add_bos_token` & `add_eos_token` flags ignored for `PreTrainedTokenizerFast`: issue discussed here and here
2. `add_special_tokens` does not update `bos_token` or `eos_token`, e.g. `add_special_tokens({'bos_token': '<new_bos>'})`
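To make the intended contract concrete, here is a minimal pure-Python mock (not the transformers API; all names and logic are illustrative) of the behavior the two fixes target: the `add_bos_token` / `add_eos_token` flags should decide whether the special ids are inserted, and updating `bos_token` via `add_special_tokens` should take effect on later encodes instead of being silently ignored:

```python
# Illustrative mock only -- sketches the contract the fix targets, not the
# real PreTrainedTokenizerFast implementation.
class MiniFastTokenizer:
    def __init__(self, vocab, bos_token="<bos>", eos_token="<eos>",
                 add_bos_token=True, add_eos_token=False):
        self.vocab = vocab
        self.bos_token, self.eos_token = bos_token, eos_token
        self.add_bos_token, self.add_eos_token = add_bos_token, add_eos_token

    def encode(self, text):
        ids = [self.vocab[w] for w in text.split()]
        # Issue 1: these flags must actually be honored.
        if self.add_bos_token:
            ids = [self.vocab[self.bos_token]] + ids
        if self.add_eos_token:
            ids = ids + [self.vocab[self.eos_token]]
        return ids

    def add_special_tokens(self, mapping):
        # Issue 2: a new bos_token should be picked up by later encodes.
        if "bos_token" in mapping:
            new = mapping["bos_token"]
            self.vocab.setdefault(new, max(self.vocab.values()) + 1)
            self.bos_token = new
```

For example, with `vocab = {"<bos>": 0, "<eos>": 1, "hello": 2}`, `encode("hello")` yields `[0, 2]` by default, `[2]` with `add_bos_token=False`, and after `add_special_tokens({"bos_token": "<new_bos>"})` the new token's id is prepended instead.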
TASKS:
- [x] added an `update_post_processor` function in `PreTrainedTokenizerFast`, based on the Llama tokenizer, which reads the bos / eos token flags
**SUPPORTS FAST ONLY**: the slow tokenizer would require updating the kwargs passed into `sp_model` so that bos/eos tokens can be added accordingly.
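For the fast path, a minimal sketch of the flag-driven templates such an `update_post_processor`-style helper could build (the string format follows the `tokenizers` `TemplateProcessing` convention of `token:sequence_id`; the helper name and exact layout are assumptions, not the PR's code):

```python
def build_templates(bos_token, eos_token, add_bos_token, add_eos_token):
    """Sketch: build single/pair post-processing templates from the flags."""
    # Sequence 0 specials, inserted only when the corresponding flag is set.
    bos0 = f"{bos_token}:0 " if add_bos_token else ""
    eos0 = f" {eos_token}:0" if add_eos_token else ""
    single = f"{bos0}$A{eos0}"
    # For pairs, the second sequence repeats the specials with sequence id 1.
    bos1 = f"{bos_token}:1 " if add_bos_token else ""
    eos1 = f" {eos_token}:1" if add_eos_token else ""
    pair = f"{single} {bos1}$B{eos1}"
    return single, pair
```

With `add_bos_token=True, add_eos_token=False` this yields `"<bos>:0 $A"` for single sequences, so flipping the flags regenerates the post-processor instead of leaving a stale template.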
Reviewer: @ArthurZucker
NOTE: the hub token seems to not have access to Llama 3; CI should pass once that is addressed.