xgrammar
Specify specific token_id in regex or grammar
For special tokens (such as </think> for reasoning models), is there a way to explicitly match the special token's token ID, rather than hoping that the combination of the matcher, tokenizer, etc. ends up on the right ID? An example of a potential issue: the </think> string in a regex could get broken up into multiple token IDs, which would have a different meaning to the model.
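To make the concern concrete, here is a minimal sketch with a toy vocabulary. It does not use xgrammar or any real tokenizer; the token strings, the IDs, and the helper `tokenizations` are all made up for illustration. It only shows that a string-level match on </think> is satisfied by more than one token ID sequence, and only one of them is the dedicated special token.

```python
# Toy vocabulary: IDs and token strings are hypothetical, not from any real model.
VOCAB = {
    0: "</",
    1: "think",
    2: ">",
    3: "</think>",  # stands in for the dedicated special token
}
SPECIAL_IDS = {3}


def tokenizations(text, vocab=VOCAB):
    """Return every token ID sequence whose token strings concatenate to `text`."""
    if not text:
        return [[]]
    results = []
    for token_id, piece in vocab.items():
        if text.startswith(piece):
            for rest in tokenizations(text[len(piece):], vocab):
                results.append([token_id] + rest)
    return results


# Every ID sequence that a purely string-based matcher would accept for "</think>".
paths = tokenizations("</think>")

# Only the sequences that actually use the special token ID.
special_paths = [p for p in paths if any(t in SPECIAL_IDS for t in p)]

print(paths)
print(special_paths)
```

Both sequences spell the same characters, so a constraint written over strings cannot tell them apart. Constraining on the token ID itself would be needed to single out the special token.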
Thanks for the feature request! This is planned in later versions. Please stay tuned!