Local deployment is not working on Windows 10
Describe the bug
Trained model artifacts are not downloaded from S3 during deploy on Windows 10.
To reproduce
Demonstrated on the example from https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_script_mode_training_and_serving/tensorflow_script_mode_training_and_serving.ipynb
import sagemaker
from sagemaker import get_execution_role
from sagemaker.tensorflow import TensorFlow
sagemaker_session = sagemaker.Session()
role = get_execution_role()
region = sagemaker_session.boto_session.region_name
training_data_uri = 's3://sagemaker-sample-data-{}/tensorflow/mnist'.format(region)
mnist_estimator2 = TensorFlow(entry_point='mnist2.py',
                              role=role,
                              train_instance_count=1,
                              train_instance_type='local',
                              framework_version='2.0.0',
                              py_version='py3')
mnist_estimator2.fit(training_data_uri)
predictor2 = mnist_estimator2.deploy(initial_instance_count=1, instance_type='local')
Expected behavior
Model artifacts should be downloaded from S3 and accessible to the serving container.
Screenshots or logs

System information
A description of your system. Please provide:
- SageMaker Python SDK version: sagemaker==1.50.10.post0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): TensorFlow
- Framework version: 2.0
- Python version: 3.7.6
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Additional context
The problem comes from how the S3ModelArtifacts path is obtained in
\sagemaker\local\image.py
In the method
def retrieve_artifacts(self, compose_data, output_data_config, job_name)
the artifact path is returned using a simple
return os.path.join(output_data, "model.tar.gz")
When this is called on Windows, it produces something like:
../tensorflow-training-2020-02-19-15-57-14-207\model.tar.gz
When SageMaker afterwards tries to download the artifacts from S3 in
\sagemaker\utils.py
using the method
def download_folder(bucket_name, prefix, target, sagemaker_session):
it fails to retrieve the files when calling
bucket.objects.filter(Prefix=prefix)
because of the \ in front of model.tar.gz.
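To illustrate the mismatch described above without needing a Windows machine or an S3 bucket, here is a minimal sketch. It uses the standard library module ntpath (the Windows flavour of os.path) to reproduce the join, and a plain list of strings as a stand-in for the bucket listing. The key names and the filter_by_prefix helper are assumptions made up for this example; they are not part of the SageMaker SDK or boto3.

```python
import ntpath      # Windows path rules, importable on any OS
import posixpath   # always joins with "/"

# Hypothetical S3 key prefix and bucket listing, for illustration only.
job_prefix = "tensorflow-training-2020-02-19-15-57-14-207"
bucket_keys = [job_prefix + "/model.tar.gz", job_prefix + "/output.tar.gz"]


def filter_by_prefix(keys, prefix):
    """Stand-in for an S3 prefix filter: keep keys that start with prefix."""
    return [key for key in keys if key.startswith(prefix)]


# What os.path.join effectively does on Windows vs. a "/"-only join.
windows_prefix = ntpath.join(job_prefix, "model.tar.gz")
portable_prefix = posixpath.join(job_prefix, "model.tar.gz")

print(repr(windows_prefix))                             # contains a backslash
print(filter_by_prefix(bucket_keys, windows_prefix))    # nothing matches
print(filter_by_prefix(bucket_keys, portable_prefix))   # model.tar.gz matches
```

S3 object keys use forward slashes, so a prefix containing a backslash cannot match any of them, which is consistent with the empty result seen here.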
Thank you for submitting a detailed bug report. It appears this issue was fixed in https://github.com/aws/sagemaker-python-sdk/pull/1302, which was released in v1.50.14.
Please try updating your version of the SageMaker Python SDK.
Hi, thank you for the reply. I tried version 1.50.16.dev0 and the problem still remains. It looks like the mentioned fix was for a similar problem, but in a different place.
The method that retrieves model artifacts ( retrieve_artifacts in sagemaker-python-sdk-master/src/sagemaker/local/image.py ) is still using return os.path.join(output_data, "model.tar.gz")
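One possible direction, sketched here purely as an assumption and not as the SDK's actual fix, is to build the artifact URI with an explicit "/" separator instead of os.path.join, so the result is the same on every platform. The helper name join_s3_uri and the bucket name are hypothetical.

```python
def join_s3_uri(base_uri, *parts):
    """Hypothetical helper: join URI segments with "/" on every platform.

    Backslashes are normalised to forward slashes, and duplicate
    separators at the segment boundaries are dropped.
    """
    segments = [base_uri.replace("\\", "/").rstrip("/")]
    segments += [part.replace("\\", "/").strip("/") for part in parts]
    return "/".join(segment for segment in segments if segment)


# Example with a made-up bucket and job name.
artifact_uri = join_s3_uri("s3://my-bucket/my-training-job", "model.tar.gz")
print(artifact_uri)
```

The same helper also repairs a URI that already picked up a backslash, which could serve as a local workaround until the join itself is changed.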
Windows support for Local Mode has been experimental and unfortunately has never been fully supported or tested.
Marking with a feature request label.
Could you please provide more details about your use case for using local mode? That would help us a lot when prioritizing the roadmap.
Thank you!
Hi, I thought I'd bring some more notice to this.
The use case for me is local testing. Currently, the only other way to test is by using real instances, storage, etc. The use cases noted in the blog here are also perfectly valid for Windows machines. Debugging that way takes extra time and adds extra expense.