
[RFC] Refactor chat template and remove model name from engine config

[Open] AllentDan opened this issue 5 months ago · 5 comments

Motivation

  • Decouple chat templates from the inference engine.
  • Lower the barrier to adding new chat templates.
  • Remove model_name from the engine configs to avoid redundant specification.
  • Support external chat templates compatible with Transformers.

Major features

  • The Tokenizer class supports Transformers' Jinja chat templates.
  • The original model.get_prompt will be moved to Tokenizer.apply_chat_template.
  • model_name is removed from TurbomindEngineConfig and PytorchEngineConfig.
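To make the second bullet concrete, here is a minimal sketch of what a chat template does: it turns a list of role/content messages into a single prompt string. The function name, signature, and ChatML-style markup are assumptions for illustration, not the final lmdeploy API; the real markup is model-specific.

```python
# Illustrative sketch only -- not the actual lmdeploy implementation.
def apply_chat_template(messages, add_generation_prompt=True):
    """Render [{'role': ..., 'content': ...}] dicts into one prompt string."""
    # ChatML-style markup (assumed here; real templates vary per model).
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
             for m in messages]
    if add_generation_prompt:
        # Leave the prompt open for the assistant's reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = apply_chat_template([{"role": "user", "content": "Hello!"}])
```

In the proposed design, this rendering step happens inside Tokenizer.apply_chat_template (which would also tokenize), so the engine itself never needs to know which template a model uses.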

How to use

For api_server, to use an external template, the command could be:

lmdeploy serve api_server $MODEL_PATH --chat-template $JINJA
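As a sketch of what such an external template file might contain, here is a ChatML-style chat template in the Transformers Jinja format (illustrative only; the markup and variable names follow the Transformers convention, and the actual template is model-specific):

```jinja
{% for message in messages %}<|im_start|>{{ message['role'] }}
{{ message['content'] }}<|im_end|>
{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant
{% endif %}
```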

For APIs like pipeline, we are going to provide documentation showing how to add a chat template in Python or Jinja. The code would look like:

chat_template = PythonTemplate()  # or a function, a Jinja string, or a file path
input_ids = tokenizer.apply_chat_template(messages, chat_template=chat_template)
pipeline(input_ids)
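Since a plain function is listed as an accepted template form, a Python-side template could be as simple as the sketch below. The signature (messages in, prompt string out) and the Vicuna-style markup are assumptions for illustration, not a confirmed lmdeploy interface.

```python
# Hypothetical callable chat template -- signature is an assumption.
def vicuna_style_template(messages):
    """Format [{'role': ..., 'content': ...}] dicts Vicuna-style."""
    role_map = {"system": "", "user": "USER: ", "assistant": "ASSISTANT: "}
    lines = [role_map[m["role"]] + m["content"] for m in messages]
    # End with the assistant prefix so the model continues from there.
    return "\n".join(lines) + "\nASSISTANT: "
```

A callable form like this is useful when the prompt logic needs real control flow (e.g. truncating history or injecting tool results) that would be awkward to express in Jinja.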

AllentDan · Jan 30 '24