sagemaker-python-sdk
Specifying image_uri in PyTorchModel gives TypeError when running deploy
Describe the bug
When creating a PyTorchModel with a specified image_uri and deploying it to an endpoint, the model object has the attribute self.framework_version=None. The check in _is_mms_version then fails, because it runs a regex search on an input of type None instead of string or bytes.
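A minimal sketch of the failure mode (illustrative only, not the SDK's own code): parsing a None framework version raises the same TypeError shown in the traceback below.

# Illustrative only: _is_mms_version parses self.framework_version, which is
# None when only image_uri is given, so the version regex search fails.
import packaging.version

framework_version = None  # what PyTorchModel ends up with when only image_uri is passed

try:
    packaging.version.Version(framework_version)
except TypeError as err:
    print(err)  # expected string or bytes-like object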
To reproduce
from sagemaker.pytorch import PyTorchModel
from sagemaker.utils import name_from_base

model = PyTorchModel(model_data=model_artifact,
                     name=name_from_base('model'),
                     role=role,
                     entry_point="torchserve-predictor.py",
                     image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.7.1-cpu-py36-ubuntu18.04",
                     )
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge', endpoint_name=endpoint_name)
Expected behavior
I expect the behavior to be the same as when framework_version and py_version are provided when creating a PyTorchModel.
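For comparison, a minimal sketch of the call that behaves as expected (versions are illustrative):

from sagemaker.pytorch import PyTorchModel
from sagemaker.utils import name_from_base

# Sketch: let the SDK resolve the container image from framework_version/py_version
model = PyTorchModel(model_data=model_artifact,
                     name=name_from_base('model'),
                     role=role,
                     entry_point="torchserve-predictor.py",
                     framework_version="1.7.1",
                     py_version="py36")
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge', endpoint_name=endpoint_name)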
Screenshots or logs
~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, **kwargs)
740 self._base_name = "-".join((self._base_name, compiled_model_suffix))
741
--> 742 self._create_sagemaker_model(instance_type, accelerator_type, tags)
743 production_variant = sagemaker.production_variant(
744 self.name, instance_type, initial_instance_count, accelerator_type=accelerator_type
~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/model.py in _create_sagemaker_model(self, instance_type, accelerator_type, tags)
306 /api/latest/reference/services/sagemaker.html#SageMaker.Client.add_tags
307 """
--> 308 container_def = self.prepare_container_def(instance_type, accelerator_type=accelerator_type)
309
310 self._ensure_base_name_if_needed(container_def["Image"])
~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/pytorch/model.py in prepare_container_def(self, instance_type, accelerator_type)
237
238 deploy_key_prefix = model_code_key_prefix(self.key_prefix, self.name, deploy_image)
--> 239 self._upload_code(deploy_key_prefix, repack=self._is_mms_version())
240 deploy_env = dict(self.env)
241 deploy_env.update(self._framework_env_vars())
~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/pytorch/model.py in _is_mms_version(self)
282 """
283 lowest_mms_version = packaging.version.Version(self._LOWEST_MMS_VERSION)
--> 284 framework_version = packaging.version.Version(self.framework_version)
285 return framework_version >= lowest_mms_version
~/.pyenv/versions/lib/python3.6/site-packages/packaging/version.py in __init__(self, version)
294
295 # Validate the version and parse it into pieces
--> 296 match = self._regex.search(version)
297 if not match:
298 raise InvalidVersion("Invalid version: '{0}'".format(version))
TypeError: expected string or bytes-like object
System information
- SageMaker Python SDK version: 2.29.1
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
- Framework version: 1.7.1
- Python version: 3.6.12
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Thanks
I was able to replicate the bug with the following system information:
- SageMaker Python SDK version: 2.41.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
- Framework version: 1.7.1
- Python version: 3.6.12
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Also reported it to AWS support on the 20th of May.
Affects me as well; the workaround seems to be to just provide a dummy version, but an annoying bug all the same.
Same for me!
Same for me. For the HuggingFace predictor it actually works, but it doesn't use the image I built, only the default one...
Update: After figuring out how to work with the repository for SageMaker images (https://github.com/aws/deep-learning-containers), I was able to fix my problems, which were solely about the HuggingFaceModel not being able to load or run custom images.
dummy version, but an annoying bug all the same
Hi, do you have an example of your workaround?
As far as I remember, I just added the parameter framework_version="1.8.1".
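For reference, a minimal sketch of that workaround based on the comments here (the version value is just the dummy mentioned above):

# Workaround sketch: keep image_uri but also pass a framework_version so that
# _is_mms_version has a string to parse instead of None.
model = PyTorchModel(model_data=model_artifact,
                     role=role,
                     entry_point="torchserve-predictor.py",
                     image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.7.1-cpu-py36-ubuntu18.04",
                     framework_version="1.8.1")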
I can't believe that this issue is still open. The way AWS issues get ignored by Amazon developers is rather disappointing.
Thanks, this seems to work for me as well... I hope they fix it soon 😄
I hope they fix it soon 😄
Me looking at my inbox and laughing frenetically: No.
Does your framework_version="1.8.1" solution definitely call the image from image_uri rather than fetching a different image via the framework_version arg?
@Michael-Bar I have the same question. Did you solve this problem?
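One way to check which image the created model actually uses (a sketch, assuming configured boto3 credentials and the model object from the snippets above):

import boto3

# Describe the SageMaker model that deploy() created and inspect its container image
sm = boto3.client("sagemaker")
description = sm.describe_model(ModelName=model.name)
print(description["PrimaryContainer"]["Image"])  # should match the image_uri that was passed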
Hi all,
https://github.com/aws/sagemaker-python-sdk/pull/3188 has partially addressed the problem. Still, some ambiguity remains in the specification of Models if py_version, framework_version, and image_uri are all passed.
Closing as fixed by #3188