sagemaker-inference-toolkit
sagemaker-inference-toolkit copied to clipboard
Server Timeout Unit is Minutes in MMS, but docstring says Seconds
Describe the bug
The model server timeout ("used for model server's backend workers before they are deemed unresponsive and rebooted") currently set in with env vars using SAGEMAKER_MODEL_SERVER_TIMEOUT
is listed in seconds in the property method description docstring...
def model_server_timeout(self): # type: () -> int
"""int: Timeout, in **seconds**, used for model server's backend workers before
they are deemed unresponsive and rebooted.
"""
return self._model_server_timeout
...but the actual unit used downstream in multi-model server worker manager is minutes, not seconds.
// TODO: Change this to configurable param
ModelWorkerResponse reply = replies.poll(responseTimeout, TimeUnit.MINUTES);
Because of this, the default timeout of 20 in inference toolkit is actually a 20 minute timeout, not a 20 second timeout.
It seems odd that the unit is minutes, and because this is a parsed as an int in inference-toolkit argparse it does only give a resolution of whole minutes (instead of say, .33 minutes for a 20s equivalent timeout), so should I report this downstream in multi-model-server? If you don't want to change it, we should at least fix the docstring in inference-toolkit.