
Provide native support for server-side tokenization

Open WilliamOnVoyage opened this issue 1 year ago • 1 comments

Hi team,

Currently, the working pattern for server-side tokenization is for users to write a `model.py` with the Python backend to perform tokenization, which is great for flexibility and customization.

Given the rise of language models and the popularity of some common model/tokenizer architectures, I'm wondering if you plan to provide tokenizer support natively, so users can configure a tokenizer just through tokenizer artifacts and `config.pbtxt`.
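For context, the existing pattern looks roughly like the sketch below. A real `model.py` would import `triton_python_backend_utils` (`pb_utils`) and a HuggingFace tokenizer; since neither is available outside a Triton container, this sketch isolates the batching/padding logic behind a plain function and uses a toy whitespace vocabulary as a stand-in for a real tokenizer, so the shape of the code is clear.

```python
# Hedged sketch of the tokenization step a Python-backend model.py performs.
# `batch_tokenize` and the toy vocab are illustrative stand-ins, not Triton API.

UNK_ID = 0  # id used for out-of-vocabulary tokens
PAD_ID = 0  # id used to right-pad each row to max_len


def batch_tokenize(texts, vocab, max_len=8):
    """Map each input string to a fixed-length row of token ids."""
    batch = []
    for text in texts:
        row = [vocab.get(tok, UNK_ID) for tok in text.split()][:max_len]
        row += [PAD_ID] * (max_len - len(row))
        batch.append(row)
    return batch


# In a real Triton Python-backend model.py, execute() would wrap this:
#
# class TritonPythonModel:
#     def execute(self, requests):
#         responses = []
#         for request in requests:
#             texts = ...  # decode the INPUT_TEXT pb_utils.Tensor
#             ids = batch_tokenize(texts, self.vocab)
#             out = pb_utils.Tensor("INPUT_IDS", np.array(ids, dtype=np.int64))
#             responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
#         return responses
```

The boilerplate around `batch_tokenize` (request decoding, tensor construction, response assembly) is exactly what native tokenizer support would let users skip.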

WilliamOnVoyage avatar Jul 24 '24 17:07 WilliamOnVoyage

Hi @WilliamOnVoyage, I believe both the vLLM and TensorRT-LLM backends handle tokenization internally, with no user code changes required, and are configurable through their respective config files or based on the model being used. Does this satisfy your needs?
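For example, with the vLLM backend the model repository typically needs only engine configuration rather than a hand-written `model.py`; the layout below is an illustrative sketch (the model name and fields are placeholders, not a verified config):

```
model_repository/
└── my_llm/                 # hypothetical model name
    ├── 1/
    │   └── model.json      # vLLM engine args, e.g. {"model": "<hf-model-id>", ...}
    └── config.pbtxt        # contains: backend: "vllm"
```

Tokenization is then performed by the engine using the tokenizer bundled with the chosen model, so no tokenizer code lives in the repository itself.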

rmccorm4 avatar Jul 31 '24 22:07 rmccorm4