Sending two "load" requests to server makes it load twice
Description
When two clients send /v2/repository/models/MODEL/load requests to the same server at the same time, the model is loaded twice.
Triton Information
What version of Triton are you using? 23.11
Are you using the Triton container or did you build it yourself? Container nvcr.io/nvidia/tritonserver:23.11-py3
To Reproduce
Start a server in explicit mode with no model loaded.
Open two terminals and run curl -X POST "http://localhost:8000/v2/repository/models/MODEL/load" -d "{}" in both at the same time. The server logs look like this:
successfully loaded MODEL
loading: MODEL
successfully loaded MODEL
successfully unloaded MODEL
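For anyone trying to reproduce this without juggling two terminals, here is a minimal Python sketch that releases two load requests at the same instant. The URL and the MODEL placeholder come from the curl command above; the helper names post_load and fire_concurrently are hypothetical, and the timing is only as simultaneous as thread scheduling allows.

```python
import threading
import urllib.request


def post_load(base_url, model):
    """POST an empty JSON body to the load endpoint used in the report."""
    req = urllib.request.Request(
        f"{base_url}/v2/repository/models/{model}/load",
        data=b"{}",
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def fire_concurrently(send, n=2):
    """Call send() from n threads, all released together by a barrier."""
    barrier = threading.Barrier(n)
    results = [None] * n

    def worker(i):
        barrier.wait()  # block until every thread is ready to send
        try:
            results[i] = send()
        except Exception as exc:  # keep the error so the caller can inspect it
            results[i] = exc

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Assumes a Triton server in explicit mode listening on localhost:8000.
    print(fire_concurrently(lambda: post_load("http://localhost:8000", "MODEL")))
```

Running it against a live server and watching the server log should show whether the load/unload sequence above appears.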
Expected behavior
The model should be loaded only once. If it does get reloaded, the log line successfully unloaded MODEL should appear before the second successfully loaded MODEL.
Hi @ShuaiShao93, thanks a lot for reaching out. Can you provide the following details?
- What type of model/backend?
- Can you reproduce this behavior with other types of models/backends, or is it specific to this one?
- I'm not sure how you are getting the unloaded log. Are you making an unload request?
I am unable to reproduce this.
When I try to load a model simultaneously, it just gets loaded once.
> Hi @ShuaiShao93, thanks a lot for reaching out. Can you provide the following details?
> - What type of model/backend?
Ensemble pipeline with Python and ONNX backends.
> - Can you reproduce this behavior with other types of models/backends, or is it specific to this one?
Sorry, I didn't get a chance to test more.
> - I'm not sure how you are getting the unloaded log. Are you making an unload request?
No, I only sent load requests simultaneously from two clients, and I saw the unloaded logs.
> I am unable to reproduce this.
> When I try to load a model simultaneously, it just gets loaded once.
@ShuaiShao93 I guess this is expected behavior under explicit model control. If you want to check whether a particular model is already loaded before sending the load request, you can query the /v2/repository/index endpoint to get the list of models and their states.
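A rough sketch of that check-then-load idea in Python, in case it helps. It assumes the index endpoint is reached with a POST to /v2/repository/index and returns a JSON array of entries with "name" and "state" fields, where "READY" marks a loaded model; the helper names http_post, is_loaded and load_if_absent are hypothetical. Note that the check and the load are still two separate requests, so two clients running this at the same moment can both see the model as absent and both send a load.

```python
import json
import urllib.request


def http_post(url, body=b"{}"):
    """Send a POST with a small JSON body and return the raw response bytes."""
    req = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def is_loaded(index_entries, model):
    """True if the index lists the model in the READY state (assumed field names)."""
    return any(
        entry.get("name") == model and entry.get("state") == "READY"
        for entry in index_entries
    )


def load_if_absent(base_url, model, post=http_post):
    """Query the repository index and send a load request only if needed.

    Returns True if a load request was sent, False if the model was already READY.
    The `post` parameter is injectable so the logic can be exercised without a server.
    """
    entries = json.loads(post(f"{base_url}/v2/repository/index"))
    if is_loaded(entries, model):
        return False
    post(f"{base_url}/v2/repository/models/{model}/load")
    return True


if __name__ == "__main__":
    # Assumes a Triton server in explicit mode listening on localhost:8000.
    print(load_if_absent("http://localhost:8000", "MODEL"))
```

This only narrows the window for a duplicate load. Clients that must never trigger a second load would need to coordinate among themselves, for example by routing all load requests through a single process.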