Server not ready: Warmup using python BLS
Hi there,
I am serving multiple models: a Python BLS model and an ONNX model. The BLS model pieces together the business logic, including calling the ONNX model. I would like to have warmup for my BLS model, but when I implement warmup in the model config, the BLS warmup runs without waiting for the ONNX model to load first. Would you please advise on ways to resolve this issue? Thanks.
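For context, this is roughly the kind of `model_warmup` stanza I have in the BLS model's `config.pbtxt` (input name, dims, and data type here are placeholders for illustration, not my actual config):

```proto
model_warmup [
  {
    name: "bls_warmup_sample"
    batch_size: 1
    inputs: {
      key: "INPUT0"
      value: {
        data_type: TYPE_FP32
        dims: [ 16 ]
        zero_data: true
      }
    }
  }
]
```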
@Tabrizian @GuanLuo any thoughts on this? Is it possible to set some kind of dependency through any built-in means before warmup?
Maybe in the BLS/Python model's initialization, do a one-time wait until the dependent model reports ready status through the REST API or the Triton API.
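A minimal sketch of that one-time wait: a generic polling helper that blocks until a readiness check passes or a timeout expires. The model name `onnx_model`, the endpoint `localhost:8000`, and the timeout values are illustrative assumptions, not from the original thread.

```python
import time


def wait_until_ready(is_ready, timeout_s=60.0, poll_interval_s=0.5):
    """Block until is_ready() returns True; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready():
            return
        time.sleep(poll_interval_s)
    raise TimeoutError("dependent model did not become ready in time")


# Hypothetical usage inside the BLS model's initialize(), assuming the
# tritonclient HTTP client is available and the server listens on :8000:
#
#   import tritonclient.http as httpclient
#   client = httpclient.InferenceServerClient("localhost:8000")
#   wait_until_ready(lambda: client.is_model_ready("onnx_model"))
```

Keeping the helper generic means the same wait works whether the ready check goes through the REST endpoint or an in-process call.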
I think detecting dependencies automatically in BLS can be complicated. How about adding a feature to Triton called lazy model loading, where Triton automatically loads a model on first use if it exists but is not already loaded? I believe there was a similar use case for this too.
Closing issue due to lack of activity. Please re-open the issue if you would like to follow up on it.