sagemaker-python-sdk
Multi-model endpoint workers die when sklearn entrypoint imports package installed with requirements.txt
Multi-model endpoint workers die when the entry point imports a package installed through requirements.txt. The package is installed successfully and the endpoint is created successfully, but inference requests always fail.
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "{
"code": 500,
"type": "InternalServerException",
"message": "Worker died."
}
To reproduce
Include a requirements.txt in the source_dir and import the installed package in the entry point script or the model_fn.
https://gist.github.com/gavinmh/267bc34ddedaf0931151a901859e165d changes the sklearn_multi_model_endpoint_home_value.ipynb example notebook.
In particular, it adds:
%%writefile $SOURCE_DIR/requirements.txt
shap
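One way to narrow down why the worker dies is to guard the import in the entry point, so that a failing import is logged rather than raised at module load. The following is a minimal debugging sketch, not a fix: `try_import` is a hypothetical helper (it is not part of the SageMaker SDK or the scikit-learn container), and it only catches Python exceptions, so a crash in native code would still kill the worker.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def try_import(module_name):
    """Import module_name and return it, or log the traceback and return None."""
    try:
        return importlib.import_module(module_name)
    except Exception:
        # Log the full traceback so the cause shows up in the worker's logs.
        logger.exception("Could not import %s", module_name)
        return None


# In the entry point this would stand in for a bare `import shap`.
shap = try_import("shap")
```

If `shap` ends up as `None`, the logged traceback should show whether the package is missing from the worker's environment or fails for another reason.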
Expected behavior
shap is imported successfully and inference requests succeed.
Screenshots or logs


System information
- SageMaker Python SDK version: 2.3.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): sklearn
- Framework version: 0.23-1
- Python version: 3.7
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Hello @gavinmh
Thank you for using Amazon SageMaker. We are looking into your issue and will get back to you with an update by 2020-08-19 17:00 Pacific time.
Best regards
Do you have any updates to share, @metrizable?
Hi @gavinmh, sorry for the delay. We're passing this along to the team that maintains the scikit-learn container.
@edwardjkim Would you be able to take a look?
Any updates, @edwardjkim?
Hi @gavinmh, did the endpoint run successfully when it was deployed without installing requirements.txt? It looks like you are modifying the scikit-learn MME notebook, which, to my knowledge, does not work with Python SDK 2.0. Could you try again after pinning the Python SDK version with pip install sagemaker==1.* (and possibly restarting the kernel)?
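Before retrying, it may help to confirm which major version of the SDK the notebook kernel actually has installed. A minimal sketch, assuming Python 3.8 or later in the kernel (for `importlib.metadata`); `installed_major_version` is a hypothetical helper name, not an SDK function:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_major_version(package):
    """Return the major version of an installed package, or None if it is absent."""
    try:
        return int(version(package).split(".")[0])
    except PackageNotFoundError:
        return None


major = installed_major_version("sagemaker")
if major is None:
    print("sagemaker is not installed in this environment")
else:
    print(f"sagemaker major version: {major}")
```

Running this after the pip install (and a kernel restart) shows whether the pin actually took effect.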
@fm1ch4 is the author of the notebook from the MME team. @fm1ch4, can you please take a look?
Hi @gavinmh, can you please confirm whether this issue still exists with the latest version of sagemaker?