[Feature] tokenizer_manager accept external tokenizer or skip tokenizer init
Motivation
Currently, tokenizer_manager only supports initializing the tokenizer through the transformers utils. But many models are trained with a different tokenizer (e.g. MiniCPM, tiktoken). On the other hand, GenerateReqInput already supports input_ids, i.e. pre-tokenized input, which means the tokenizer is not needed at all for such generate requests. A minimal sketch of what this could look like is below.
By the way, vLLM supports skip_tokenizer_init; please consider a similar setting for flexibility.
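To make the request concrete, here is a minimal sketch of a pre-tokenized generate request. It assumes a local sglang server on port 30000; the input_ids field mirrors GenerateReqInput, while a skip-tokenizer mode (analogous to vLLM's skip_tokenizer_init) that returns raw ids instead of text is the behavior this issue proposes, not something that exists today:

```python
# Sketch of a pre-tokenized /generate request. Assumptions: a local sglang
# server on port 30000, and a hypothetical skip-tokenizer mode that makes the
# server return raw output ids instead of detokenized text.
import requests

resp = requests.post(
    "http://localhost:30000/generate",
    json={
        "input_ids": [1, 15043, 3186],              # prompt already tokenized by the client
        "sampling_params": {"max_new_tokens": 16},  # standard sampling options
    },
)
print(resp.json())  # with the tokenizer skipped, this would carry raw output ids
```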
Thanks!
This should be easy to support. Could you give us a specific example or model name that we can run the test on? If the tokenizer is skipped, does it mean the server will accept input_ids and return output_ids without detokenization?
> This should be easy to support. Could you give us a specific example or model name that we can run the test on?
Maybe you can try https://huggingface.co/openbmb/cpm-bee-2b; my tokenizer is similar to it, an old-style vocab.txt list, which fails transformers AutoTokenizer init. Or https://huggingface.co/meta-llama/Meta-Llama-3-8B, which uses a tiktoken tokenizer (not sure if it is already integrated with transformers, though).
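For reference, such an old-style vocab.txt tokenizer can be wrapped in a few lines of plain Python. This is only a hypothetical sketch of the kind of external tokenizer object the feature would need to accept; none of these names are existing sglang interfaces, and the whitespace encoding is purely illustrative:

```python
# Hypothetical external tokenizer built from an old-style vocab.txt list
# (one token per line), the kind that AutoTokenizer cannot load.
class VocabListTokenizer:
    def __init__(self, vocab_path: str):
        with open(vocab_path, encoding="utf-8") as f:
            self.vocab = [line.rstrip("\n") for line in f]
        self.token_to_id = {tok: i for i, tok in enumerate(self.vocab)}

    def encode(self, text: str) -> list[int]:
        # Naive whitespace split, for illustration only.
        return [self.token_to_id[tok] for tok in text.split() if tok in self.token_to_id]

    def decode(self, ids: list[int]) -> str:
        return " ".join(self.vocab[i] for i in ids)
```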
> If the tokenizer is skipped, does it mean the server will accept input_ids and return output_ids without detokenization?
Yes. I can run the encode/decode steps for those ids on my side.
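As an illustration of that client-side round trip, here is a small sketch using tiktoken; the "cl100k_base" encoding stands in for the model's real encoding, and the generated ids are placeholders rather than actual server output:

```python
# Sketch of client-side tokenization and detokenization around a server that
# works purely on ids (assumption: tiktoken's "cl100k_base" stands in for the
# model's real encoding).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

input_ids = enc.encode("The capital of France is")  # encode before sending
# ... server generates from input_ids and returns raw output ids ...
output_ids = [2114]                                 # placeholder generated ids
print(enc.decode(output_ids))                       # decode on the client
```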
Sounds good. We will look into this later. If you have bandwidth, contributions are welcome!
> Sounds good. We will look into this later. If you have bandwidth, contributions are welcome!
I have tried filing https://github.com/sgl-project/sglang/pull/959; could you help take a look?