
When will a new worker be started by TorchServe?

Open NormXU opened this issue 2 years ago • 2 comments

[Question]

As mentioned here:

max_worker is the parameter that TorchServe will make no more than this number of workers for the specified model.

Does that mean TorchServe will automatically start a new worker for the registered model during inference, as long as there is still enough GPU memory?

Suppose we have a configuration like the one below:

models={\
  "network": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "network.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 1,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  }\
}

I monitored my server and noticed that TorchServe always started only 1 worker for network.mar, even when there was enough GPU memory to start 4 workers.

Thanks in advance for your help and explanation.

NormXU, May 24 '22 08:05

Hi @NormXU, just to clarify: if you set minWorkers=4, or some number larger than 1, do you get the behavior you expect, i.e. a larger memory allocation on the GPU? But when it's equal to 1, you're observing that maxWorkers has no impact?

msaroufim, May 26 '22 05:05

@msaroufim Exactly. I expected TorchServe to automatically start or kill workers according to the remaining GPU memory, with maxWorkers being the largest number of workers a handler can start. However, my experiments suggest that it does not work this way.

NormXU, May 26 '22 07:05
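
The observation above (only minWorkers workers running, regardless of free GPU memory) fits with workers being added only on explicit request. For anyone who wants to change the worker count by hand, here is a minimal sketch using the scale-workers call of the TorchServe management API (`PUT /models/{model_name}` with the `min_worker`, `max_worker` and `synchronous` query parameters). It assumes the management API listens on the default address `http://localhost:8081` and that the model is registered as `network`, as in the configuration in the question. The helper names `build_scale_request` and `scale_workers` are hypothetical, not part of TorchServe.

```python
import urllib.parse
import urllib.request


def build_scale_request(model_name, min_worker, max_worker=None,
                        base_url="http://localhost:8081", synchronous=True):
    """Build (but do not send) a scale-workers request.

    base_url is an assumption: the default management address.
    """
    # Query parameters accepted by the management API's scale-workers call.
    params = {
        "min_worker": min_worker,
        "synchronous": str(synchronous).lower(),
    }
    if max_worker is not None:
        params["max_worker"] = max_worker

    query = urllib.parse.urlencode(params)
    url = f"{base_url}/models/{urllib.parse.quote(model_name)}?{query}"
    return urllib.request.Request(url, method="PUT")


def scale_workers(model_name, min_worker, max_worker=None,
                  base_url="http://localhost:8081", timeout=120):
    """Send the request; needs a running TorchServe instance."""
    request = build_scale_request(model_name, min_worker, max_worker, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

With a server running, `scale_workers("network", min_worker=4, max_worker=4)` would ask TorchServe to bring the model up to 4 workers; whether they all fit in GPU memory still has to be checked separately.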