
Run and query a finetuned / custom T5 model in Triton Inference Server

Open Valterrsj opened this issue 3 years ago • 2 comments

Hi! I have two questions; could you help me?

  1. Is it possible to use Triton Inference Server with a finetuned / custom T5 model whose vocabulary size differs from the original (e.g. MODEL_VOC_SIZE = 32100)? Do you have a guide or example repository?

  2. How do I send a query to the Triton server for a finetuned / custom T5 model? Do you have a guide or example repository?

Valterrsj avatar Jun 06 '22 12:06 Valterrsj

  1. It is possible. Once you convert the model successfully, the model configuration (including the vocabulary size) is saved to a config file, which the backend reads at load time, like here.
  2. Please refer to the Triton T5 guide.
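For reference, a query to the FasterTransformer backend on Triton can be sketched with the standard `tritonclient` HTTP client. This is a minimal sketch, not the exact code from the guide: the tensor names (`input_ids`, `sequence_length`, `max_output_len`, `output_ids`) and the model name `fastertransformer` are assumptions here; check the names in your deployed model's `config.pbtxt`.

```python
# Hedged sketch of querying a FasterTransformer T5 model on Triton.
# Tensor names and the model name are assumptions; verify them against
# your model repository's config.pbtxt.
import numpy as np


def build_inputs(token_ids, max_output_len=64):
    """Pack a single tokenized sequence into the numpy tensors
    a typical FasterTransformer T5 deployment expects."""
    input_ids = np.array([token_ids], dtype=np.uint32)       # shape [1, seq_len]
    seq_len = np.array([[len(token_ids)]], dtype=np.uint32)  # shape [1, 1]
    max_len = np.array([[max_output_len]], dtype=np.uint32)  # shape [1, 1]
    return input_ids, seq_len, max_len


def query_triton(token_ids, url="localhost:8000", model="fastertransformer"):
    """Send one request and return the generated token ids."""
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url=url)
    input_ids, seq_len, max_len = build_inputs(token_ids)

    inputs = []
    for name, arr in [("input_ids", input_ids),
                      ("sequence_length", seq_len),
                      ("max_output_len", max_len)]:
        tensor = httpclient.InferInput(name, arr.shape, "UINT32")
        tensor.set_data_from_numpy(arr)
        inputs.append(tensor)

    result = client.infer(model, inputs)
    return result.as_numpy("output_ids")
```

The token ids would come from the same tokenizer used for finetuning (e.g. a Hugging Face `T5Tokenizer`); the custom vocabulary size only needs to match what the converter wrote into the model's config file.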

byshiue avatar Jun 06 '22 12:06 byshiue

@byshiue, thanks a lot for the quick response. I will check the mentioned guides in detail.

Valterrsj avatar Jun 06 '22 13:06 Valterrsj

Closing this issue because it is inactive. Feel free to re-open it if you still have any problems.

byshiue avatar Sep 08 '22 07:09 byshiue