aws-step-functions-data-science-sdk-python
timestamp mismatch when using code_location
Hi,
When code_location is set on the estimator passed to TrainingStep(), the timestamp in the uploaded S3 path and the timestamp in sagemaker_submit_directory do not match (they differ by about 400 ms). This causes the execution to fail.
In a SageMaker training job, the timestamps match even when code_location is used.
S3 uploaded path: s3://my-bucket/model/sagemaker-xgboost-2020-06-10-06-29-37-910/source/sourcedir.tar.gz
sagemaker_submit_directory: s3://my-bucket/model/sagemaker-xgboost-2020-06-10-06-29-38-323/source/sourcedir.tar.gz
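To put a number on the gap, here is a minimal sketch that pulls the timestamp out of each of the two paths above and subtracts them. The helper name job_timestamp and the regular expression are my own, written only for this illustration; they are not part of any SDK.

```python
import re
from datetime import datetime, timedelta

# The two paths reported above.
uploaded = "s3://my-bucket/model/sagemaker-xgboost-2020-06-10-06-29-37-910/source/sourcedir.tar.gz"
submitted = "s3://my-bucket/model/sagemaker-xgboost-2020-06-10-06-29-38-323/source/sourcedir.tar.gz"

# Matches YYYY-MM-DD-HH-MM-SS followed by a three-digit millisecond field.
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})-(\d{3})")


def job_timestamp(path):
    """Hypothetical helper: extract the timestamp embedded in a job-name path."""
    match = TIMESTAMP.search(path)
    if match is None:
        raise ValueError("no timestamp found in {!r}".format(path))
    base = datetime.strptime(match.group(1), "%Y-%m-%d-%H-%M-%S")
    return base.replace(microsecond=int(match.group(2)) * 1000)


# Integer number of milliseconds between the two timestamps.
delta_ms = (job_timestamp(submitted) - job_timestamp(uploaded)) // timedelta(milliseconds=1)
print(delta_ms)
```

Because the two prefixes differ, the sourcedir.tar.gz that sagemaker_submit_directory points to is not the object that was actually uploaded.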
# Open Source distributed script mode
# Note: region, bucket_name, hyperparams and role are assumed to be defined earlier.
import boto3
from sagemaker.session import s3_input, Session
from sagemaker.xgboost.estimator import XGBoost

boto_session = boto3.Session(region_name=region)
session = Session(boto_session=boto_session)
output_path = 's3://{}/{}'.format(bucket_name, 'model')

xgb_script_mode_estimator = XGBoost(
    entry_point='xgboost.py',
    source_dir='source',
    framework_version='0.90-2',  # Note: framework_version is mandatory
    hyperparameters=hyperparams,
    role=role,
    train_instance_count=1,
    train_instance_type='ml.m5.2xlarge',
    code_location=output_path,  # <- causes the mismatch
    output_path=output_path
)
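This thread does not identify the root cause. As a guess only, a gap like this is what you would see if a millisecond-timestamped job name were generated twice, once for the upload and once for the hyperparameter. The sketch below reproduces that pattern with made-up helpers (timestamped_name, source_uri) and the bucket from the paths above; it does not call the SDK and is not its actual code.

```python
from datetime import datetime, timedelta


def timestamped_name(base, now):
    """Hypothetical stand-in for a millisecond-timestamped job-name generator."""
    return "{}-{}-{:03d}".format(
        base, now.strftime("%Y-%m-%d-%H-%M-%S"), now.microsecond // 1000
    )


def source_uri(code_location, job_name):
    """Hypothetical helper: where the packaged source would live for a job name."""
    return "{}/{}/source/sourcedir.tar.gz".format(code_location, job_name)


code_location = "s3://my-bucket/model"

# Assumption: the name is computed at two different moments, 413 ms apart.
t_upload = datetime(2020, 6, 10, 6, 29, 37, 910000)
t_submit = t_upload + timedelta(milliseconds=413)

uploaded_to = source_uri(code_location, timestamped_name("sagemaker-xgboost", t_upload))
submit_dir = source_uri(code_location, timestamped_name("sagemaker-xgboost", t_submit))
print(uploaded_to == submit_dir)

# Generating the name once and reusing it keeps both paths identical.
job_name = timestamped_name("sagemaker-xgboost", t_upload)
print(source_uri(code_location, job_name) == source_uri(code_location, job_name))
```

If this guess is right, a fix would need to derive both the upload prefix and sagemaker_submit_directory from a single job name.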
Hi @AtsunoriFujita, sorry for the late response!
Thank you for bringing this to our attention. We will need to provide a fix so that the behaviour is consistent with SageMaker training jobs.
Tagging this as a bug.