XGBoost estimator, security_group_ids parameter doesn't work.

Open dmitrybugakov opened this issue 4 years ago • 5 comments

Describe the bug I'm trying to submit a job from a local environment to SageMaker in script mode, with a custom VPC. I'm using "sagemaker.xgboost.estimator.XGBoost" with the security_group_ids parameter. The job is submitted successfully, but without any VPC changes applied, and I did not notice any errors in the log.

To reproduce

from sagemaker.session import Session
from sagemaker.inputs import TrainingInput
from sagemaker.xgboost.estimator import XGBoost
import boto3

boto_session = boto3.Session()
session = Session(boto_session=boto_session)

entry_point=""
source_dir="src.tar.gz"
role=""
instance_type=""
framework_version="1.0-1"
instance_count=None
security_group_ids=[""]

xgb_estimator = XGBoost(
    entry_point=entry_point,
    source_dir=source_dir,
    role=role,
    security_group_ids=security_group_ids,
    instance_count=instance_count,
    instance_type=instance_type,
    framework_version=framework_version
)

xgb_estimator.fit()

Expected behavior There are no errors, but also no changes regarding the VPC. No custom VPC settings are applied.

dmitrybugakov avatar Jan 24 '21 19:01 dmitrybugakov

@dmitrybugakov Thank you for using Amazon SageMaker.

Which python version do you see the bug on?

ahsan-z-khan avatar Jan 25 '21 21:01 ahsan-z-khan

@ahsan-z-khan

python==3.8.6
boto3==1.16.59
sagemaker==2.24.0

dmitrybugakov avatar Jan 26 '21 07:01 dmitrybugakov

Hey @dmitrybugakov,

Apologies for the late response.

In order to use custom security groups, you will need to also provide corresponding subnets.
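As a minimal sketch of what that looks like for the reproduction script above: the estimator accepts a `subnets` argument alongside `security_group_ids`. The IDs below are hypothetical placeholders, and the estimator call is left commented out because it needs valid AWS credentials and a real role.

```python
# Hypothetical placeholder IDs; substitute values from your own VPC.
subnets = ["subnet-0123456789abcdef0"]
security_group_ids = ["sg-0123456789abcdef0"]

# Both keyword arguments are passed together so that a VPC config can be built.
vpc_kwargs = {
    "subnets": subnets,
    "security_group_ids": security_group_ids,
}

# Usage with the estimator from the report (not executed here):
# xgb_estimator = XGBoost(
#     entry_point=entry_point,
#     source_dir=source_dir,
#     role=role,
#     instance_count=instance_count,
#     instance_type=instance_type,
#     framework_version=framework_version,
#     **vpc_kwargs,
# )
# xgb_estimator.fit()
```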

The Python SDK determines this during fit in this line: https://github.com/aws/sagemaker-python-sdk/blob/e08c04e6ed0fdfb7e9e873d119769509f3ed74de/src/sagemaker/job.py#L81

Which ends up calling: https://github.com/aws/sagemaker-python-sdk/blob/e08c04e6ed0fdfb7e9e873d119769509f3ed74de/src/sagemaker/estimator.py#L1245 and runs into this conditional: https://github.com/aws/sagemaker-python-sdk/blob/e08c04e6ed0fdfb7e9e873d119769509f3ed74de/src/sagemaker/vpc_utils.py#L26

This is not a nice user experience, as it should fail if the user provides only one of the two required inputs.
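To illustrate the behavior, here is a simplified stand-in for that kind of conditional, not a copy of the SDK source. The function names `to_vpc_config` and `to_vpc_config_strict` are hypothetical; the second one sketches how a check could fail loudly instead of silently dropping the setting.

```python
from typing import Dict, List, Optional


def to_vpc_config(
    subnets: Optional[List[str]],
    security_group_ids: Optional[List[str]],
) -> Optional[Dict[str, List[str]]]:
    """Return a VPC config dict, or None if either input is missing.

    This mirrors the silent behavior described above: passing only
    security_group_ids yields None, so no VPC config is sent.
    """
    if subnets is None or security_group_ids is None:
        return None
    return {"Subnets": subnets, "SecurityGroupIds": security_group_ids}


def to_vpc_config_strict(
    subnets: Optional[List[str]],
    security_group_ids: Optional[List[str]],
) -> Optional[Dict[str, List[str]]]:
    """Hypothetical stricter variant: reject a half-specified VPC config."""
    if (subnets is None) != (security_group_ids is None):
        raise ValueError(
            "subnets and security_group_ids must be provided together"
        )
    return to_vpc_config(subnets, security_group_ids)
```

With only one of the two arguments set, `to_vpc_config` returns `None` without any error, which matches the report, while the strict variant raises a `ValueError`.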

I have made note of this issue on our end.

ChoiByungWook avatar Mar 02 '21 01:03 ChoiByungWook

@ChoiByungWook thank you! If I have free time and the issue still exists, I'll fix it.

dmitrybugakov avatar Mar 03 '21 11:03 dmitrybugakov

I am seeing that the subnets and security groups do not appear in the console (even when both are set using the XGBoost estimator client). It seems like they're not being forwarded. If I set them manually in the console I don't get errors, so I don't think it is due to a silent failure.
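One way to check this outside the console is to inspect the training job description returned by boto3. The sketch below assumes that the `describe_training_job` response includes a `VpcConfig` entry only when a VPC config was actually applied; the helper names are hypothetical, and the boto3 call needs valid AWS credentials.

```python
from typing import Any, Dict, Optional


def extract_vpc_config(description: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the VpcConfig entry of a job description, or None if absent."""
    return description.get("VpcConfig")


def describe_job_vpc_config(job_name: str) -> Optional[Dict[str, Any]]:
    """Fetch a training job description and return its VPC config, if any."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("sagemaker")
    description = client.describe_training_job(TrainingJobName=job_name)
    return extract_vpc_config(description)
```

If `describe_job_vpc_config` returns `None` for a job that was submitted with both settings, the values were not forwarded to the training job request.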

pbendevis avatar Jun 01 '22 17:06 pbendevis