
Server not ready: Warmup using python BLS

delonleonard opened this issue 3 years ago · 2 comments

Hi there,

I am serving multiple models: a Python BLS model and an ONNX model. The Python BLS model stitches together the business logic, including calling the ONNX model. I would like to add warmup for the BLS model, but when I implement warmup in its model config, the BLS warmup runs without waiting for the ONNX model to load first. Could you please advise how to resolve this? Thanks.
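(For reference, warmup is declared in the BLS model's config.pbtxt with a model_warmup block; a minimal sketch of such a stanza, where the input name, data type, and dims are placeholders for whatever the BLS model actually expects, could look like this:)

```
model_warmup [
  {
    name: "bls_warmup_sample"
    batch_size: 1
    inputs: {
      key: "INPUT0"            # placeholder input name
      value: {
        data_type: TYPE_FP32   # placeholder dtype
        dims: [ 16 ]           # placeholder shape
        zero_data: true        # send all-zero data for warmup
      }
    }
  }
]
```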

delonleonard · Jun 27 '22

@Tabrizian @GuanLuo any thoughts on this? Is there any built-in way to express a dependency so that warmup waits for the other model?

Maybe in the BLS/Python model's initialization, do a one-time wait until the dependent model reports a ready status through the REST API or the Triton API.
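A minimal sketch of that idea, using the Triton HTTP client from the Python backend's initialize() (the dependency name onnx_model, the localhost:8000 endpoint, and the timeout are assumptions for illustration):

```python
import time

import triton_python_backend_utils as pb_utils  # available inside the Python backend
import tritonclient.http as httpclient


class TritonPythonModel:
    def initialize(self, args):
        # One-time wait: block until the dependent ONNX model reports ready,
        # so the BLS warmup requests issued afterwards do not hit an unloaded model.
        client = httpclient.InferenceServerClient(url="localhost:8000")  # default HTTP port
        deadline = time.time() + 300  # give up after 5 minutes
        while not client.is_model_ready("onnx_model"):  # hypothetical dependency name
            if time.time() > deadline:
                raise pb_utils.TritonModelException(
                    "onnx_model did not become ready before the timeout")
            time.sleep(1)

    def execute(self, requests):
        ...  # normal BLS logic that calls onnx_model
```

Note this assumes Triton loads the two models concurrently (or loads the dependency first); if models were loaded strictly one at a time, blocking in initialize() could stall startup.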

rmccorm4 · Jun 29 '22

I think detecting dependencies automatically in BLS can be complicated. How about adding a feature to Triton, call it lazy model loading, such that Triton automatically loads a model that exists in the repository but is not yet loaded when it is first requested? I believe there was a similar use case for this too.
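(Until something like that exists, one rough approximation of lazy loading, not the proposed feature itself, is to have the caller explicitly load the dependency when it is missing. This only works when Triton is started with --model-control-mode=explicit; the model name and endpoint below are illustrative:)

```python
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Explicit load requests are only honored in --model-control-mode=explicit;
# in the default "none" mode they are rejected.
if not client.is_model_ready("onnx_model"):
    client.load_model("onnx_model")
```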

Tabrizian · Jul 07 '22

Closing this issue due to lack of activity. Please re-open it if you would like to follow up.

jbkyang-nvi · Nov 22 '22