tensorrtllm_backend How is GptManager used in Triton backend?

Open ekagra-ranjan opened this issue 1 year ago • 1 comment

I see that the Triton backend creates a GptManager object and passes it the engine directory. However, I am unable to find any code that shows how this GptManager is called. All I can see is the backend calling some Triton functions, and GptManager is not an argument to any of those calls, so I am curious how the engine is invoked from the Triton backend.

Can I please get some pointers to the code which does this?

Thanks!

Apr 19 '24 19:04 ekagra-ranjan

GptManager is declared in this header file: https://github.com/triton-inference-server/tensorrtllm_backend/blob/bf5e9007a3f16c7fc76cb156a3362d1caae198dd/inflight_batcher_llm/src/model_instance_state.h#L39, but its implementation is not open source.

Apr 26 '24 06:04 byshiue
