FasterTransformer
Run and query a finetuned / custom T5 model in Triton Inference Server
Hi guys! I have two questions; can you help me?
- Is it possible to use Triton Inference Server with a finetuned / custom T5 model whose vocabulary size differs from the original (e.g. MODEL_VOC_SIZE = 32100)? Do you have any guide or repository with an example?
- How do I send a query to the Triton server for a finetuned / custom T5 model? Do you have any guide or repository with an example?
- It is possible. When you convert the model successfully, the model configuration will be saved in a config file and the backend will read it, like here.
- Please refer to the Triton T5 guide.
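For illustration, the config file written by the conversion script typically records the model hyperparameters, including the vocabulary size, which the backend then reads at load time. The snippet below is only a sketch; the exact section and key names depend on the converter version you use, so check the file your conversion script actually emits:

```ini
; Illustrative fragment of a converted-model config file.
; Key names are assumptions -- verify against the file produced by your converter.
[encoder]
vocab_size = 32100   ; custom vocabulary size picked up by the backend
d_model = 768
num_heads = 12

[decoder]
vocab_size = 32100
d_model = 768
num_heads = 12
```

Because the backend reads `vocab_size` from this file rather than hard-coding it, a finetuned model with a non-standard vocabulary should work as long as the converted checkpoint and config are consistent.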
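As a complement to the guide, here is a minimal sketch of querying a Triton server over its HTTP endpoint using only the Python standard library, via the KServe v2 JSON inference protocol that Triton implements. The tensor names (`input_ids`, `sequence_length`, `max_output_len`) are assumptions based on typical FasterTransformer T5 backend configs; check the `config.pbtxt` of your deployed model, and note that `input_ids` must come from the same tokenizer used for finetuning.

```python
import json
import urllib.request


def build_infer_request(input_ids, max_output_len=32):
    """Build a KServe-v2-style JSON inference request body.

    input_ids: list of token-id lists, shape [batch, seq_len].
    Tensor names below are illustrative; match them to your model's
    config.pbtxt before sending a real request.
    """
    batch, seq_len = len(input_ids), len(input_ids[0])
    flat = [tok for row in input_ids for tok in row]
    return {
        "inputs": [
            {"name": "input_ids", "datatype": "INT32",
             "shape": [batch, seq_len], "data": flat},
            {"name": "sequence_length", "datatype": "INT32",
             "shape": [batch, 1], "data": [seq_len] * batch},
            {"name": "max_output_len", "datatype": "INT32",
             "shape": [batch, 1], "data": [max_output_len] * batch},
        ]
    }


def query(url, model, body):
    """POST the request body to Triton's v2 infer endpoint and return the JSON reply."""
    req = urllib.request.Request(
        f"{url}/v2/models/{model}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Build a request for a single sequence of three (hypothetical) token ids.
body = build_infer_request([[37, 423, 55]])
# To send it against a running server, e.g.:
#   result = query("http://localhost:8000", "fastertransformer", body)
#   output_ids = result["outputs"]
```

The official `tritonclient` Python package wraps this same protocol with typed helpers (`InferInput`, `InferRequestedOutput`), which is usually more convenient than raw JSON; the sketch above just shows what goes over the wire.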
@byshiue, thanks a lot for the quick response. I will check the mentioned guides in detail.
Closing this issue because it is inactive. Feel free to re-open it if you still have any problems.