FasterTransformer
Serve DeBERTa using FasterTransformer in Triton
Hi, is there any tutorial we can refer to for serving a DeBERTa model with FasterTransformer in Triton? I think the steps would be:
- Convert a deberta-v2 model into the FasterTransformer format;
- Dump the weights;
- Load it into the Triton Inference Server.
However, I only see step 1 covered, and only with a TensorFlow example:
https://github.com/NVIDIA/FasterTransformer/pull/725
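
For reference, here is a minimal sketch of what I have in mind for steps 1–2: dumping a HuggingFace deberta-v2 checkpoint to one flat binary file per tensor, which is the general checkpoint layout FasterTransformer loaders read. The model name, output directory, and the `model.<param name>.bin` file-naming convention are assumptions on my part; the exact names and any weight transposes/fusions would be defined by the real conversion script in the FasterTransformer examples, so this is only a starting point.

```python
# Rough sketch (not the official converter): export HuggingFace deberta-v2
# weights as one flat fp32 binary per tensor. File naming and layout here
# are assumptions modeled on other FasterTransformer converters.
from pathlib import Path

import numpy as np
import torch
from transformers import AutoModel

model_name = "microsoft/deberta-v3-base"  # any checkpoint using the deberta-v2 architecture
saved_dir = Path("deberta-ft/1-gpu")      # hypothetical output directory
saved_dir.mkdir(parents=True, exist_ok=True)

model = AutoModel.from_pretrained(model_name)
model.eval()

with torch.no_grad():
    for name, tensor in model.state_dict().items():
        # One binary file per tensor; "model.<param name>.bin" is an assumed
        # naming convention, not something taken from the DeBERTa example.
        array = tensor.cpu().float().numpy()
        array.astype(np.float32).tofile(saved_dir / f"model.{name}.bin")

print(f"Dumped {len(model.state_dict())} tensors to {saved_dir}")
```

The remaining piece would be pointing the Triton fastertransformer backend at the resulting checkpoint directory, which is the part I have not found documented for DeBERTa.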