sagemaker-python-sdk
Support retaining current desired instance count when updating endpoint
Describe the feature you'd like
By default, when the SageMaker Python SDK updates an inference endpoint, it discards the endpoint's current runtime value for desired instance count (as set by the endpoint's autoscaling policy). Please add support for the boolean parameter RetainAllVariantProperties of the SageMaker UpdateEndpoint API as a way to solve this issue. See https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html#API_UpdateEndpoint_RequestSyntax
How would this feature be used? Please describe.
I would pass RetainAllVariantProperties=True when updating an endpoint, so that the update keeps the desired instance count currently set at runtime by the autoscaling policy.
Describe alternatives you've considered
Use boto3 as a workaround.
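For reference, a minimal sketch of the boto3 workaround, which calls UpdateEndpoint directly with RetainAllVariantProperties=True. The helper name `update_endpoint_retaining_variants` and the endpoint and config names in the usage note are hypothetical, and the optional `client` argument is an assumption added so that the call can be exercised without AWS credentials.

```python
from typing import Any, Optional


def update_endpoint_retaining_variants(
    endpoint_name: str,
    new_config_name: str,
    client: Optional[Any] = None,
) -> dict:
    """Point an existing endpoint at a new endpoint config while asking
    SageMaker to retain the variant properties currently in effect.

    Hypothetical helper, not part of the SageMaker Python SDK.
    """
    if client is None:
        # Imported lazily so the helper can be used with an injected client.
        import boto3

        client = boto3.client("sagemaker")

    # RetainAllVariantProperties is a documented boolean parameter of the
    # UpdateEndpoint API; the endpoint config must already exist.
    return client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=new_config_name,
        RetainAllVariantProperties=True,
    )
```

Usage would look like `update_endpoint_retaining_variants("my-endpoint", "my-endpoint-config-v2")`, run after the new endpoint config has been created.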