
[RFC] Refactor chat template and remove model name from engine config

[Open] AllentDan opened this issue 5 months ago · 5 comments

Motivation

  • Decouple chat templates from the inference engine.
  • Lower the barrier to adding new chat templates.
  • Remove model_name from the engine configs to avoid redundant specification.
  • Support external chat templates compatible with Transformers.

Major features

  • The Tokenizer class supports Transformers' Jinja chat templates.
  • The original model.get_prompt will be moved to Tokenizer.apply_chat_template.
  • model_name is removed from TurbomindEngineConfig and PytorchEngineConfig.
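To make the second bullet concrete, here is a minimal sketch of what a chat template does: it turns a list of role/content messages into a single prompt string. The function name, signature, and ChatML-style markup are assumptions for illustration, not the final lmdeploy API; the real markup is model-specific.

```python
# Illustrative sketch only -- not the actual lmdeploy implementation.
def apply_chat_template(messages, add_generation_prompt=True):
    """Render [{'role': ..., 'content': ...}] dicts into one prompt string."""
    # ChatML-style markup (assumed here; real templates vary per model).
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
             for m in messages]
    if add_generation_prompt:
        # Leave the prompt open for the assistant's reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = apply_chat_template([{"role": "user", "content": "Hello!"}])
```

In the proposed design, this rendering step happens inside Tokenizer.apply_chat_template (which would also tokenize), so the engine itself never needs to know which template a model uses.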

How to use

For api_server, to use an external template, the command could be:

lmdeploy serve api_server $MODEL_PATH --chat-template $JINJA
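As a sketch of what such an external template file might contain, here is a ChatML-style chat template in the Transformers Jinja format (illustrative only; the markup and variable names follow the Transformers convention, and the actual template is model-specific):

```jinja
{% for message in messages %}<|im_start|>{{ message['role'] }}
{{ message['content'] }}<|im_end|>
{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant
{% endif %}
```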

For APIs like pipeline, we are going to provide documentation showing how to add a chat template in Python or Jinja. The code would look like:

chat_template = PythonTemplate()  # or a function, a Jinja string, or a file path
input_ids = tokenizer.apply_chat_template(messages, chat_template=chat_template)
pipeline(input_ids)
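Since a plain function is listed as an accepted template form, a Python-side template could be as simple as the sketch below. The signature (messages in, prompt string out) and the Vicuna-style markup are assumptions for illustration, not a confirmed lmdeploy interface.

```python
# Hypothetical callable chat template -- signature is an assumption.
def vicuna_style_template(messages):
    """Format [{'role': ..., 'content': ...}] dicts Vicuna-style."""
    role_map = {"system": "", "user": "USER: ", "assistant": "ASSISTANT: "}
    lines = [role_map[m["role"]] + m["content"] for m in messages]
    # End with the assistant prefix so the model continues from there.
    return "\n".join(lines) + "\nASSISTANT: "
```

A callable form like this is useful when the prompt logic needs real control flow (e.g. truncating history or injecting tool results) that would be awkward to express in Jinja.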

AllentDan · Jan 30 '24