Provide native support for server-side tokenization
Hi team,
Currently, the working pattern for server-side tokenization is to write a model.py for the Python backend that performs the tokenization, which is great for flexibility and customization.
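For reference, here is roughly what that pattern boils down to (a minimal sketch, assuming a Hugging Face tokenizer whose artifacts ship in the model's version directory; the `TEXT` / `INPUT_IDS` tensor names are illustrative):

```python
import numpy as np
import triton_python_backend_utils as pb_utils
from transformers import AutoTokenizer


class TritonPythonModel:
    def initialize(self, args):
        # Load the tokenizer artifacts shipped alongside model.py in the
        # model repository (assumption: they live in the version directory).
        path = f"{args['model_repository']}/{args['model_version']}"
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        if self.tokenizer.pad_token is None:
            # Some tokenizers (e.g. GPT-2 style) define no pad token.
            self.tokenizer.pad_token = self.tokenizer.eos_token

    def execute(self, requests):
        responses = []
        for request in requests:
            # "TEXT" is a TYPE_STRING input carrying UTF-8 prompts.
            texts = pb_utils.get_input_tensor_by_name(request, "TEXT").as_numpy()
            prompts = [t.decode("utf-8") for t in texts.flatten()]
            encoded = self.tokenizer(prompts, padding=True)
            ids = pb_utils.Tensor(
                "INPUT_IDS", np.array(encoded["input_ids"], dtype=np.int64)
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[ids]))
        return responses
```

The matching config.pbtxt declares `TEXT` as `TYPE_STRING` and `INPUT_IDS` as `TYPE_INT64`; the point is that all of this has to be hand-written per model rather than configured declaratively.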
That said, given the rise of language models and the popularity of a few common model/tokenizer architectures, I'm wondering whether you plan to support tokenization natively, so users can configure a tokenizer just through the tokenizer artifacts and config.pbtxt.
Hi @WilliamOnVoyage, I believe both the vLLM and TensorRT-LLM backends handle tokenization internally, with no user code required, and they are configurable through their respective config files or based on the model being used. Does this satisfy your needs?
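For instance, with the vLLM backend the model directory typically holds just a model.json of vLLM engine arguments rather than any tokenization code. A rough sketch (values are illustrative; the exact schema is described in the vllm_backend docs):

```json
{
  "model": "facebook/opt-125m",
  "disable_log_requests": true,
  "gpu_memory_utilization": 0.5
}
```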