amazon-sagemaker-examples
[Bug Report] TabTransformer/Gluon auto SM built-in FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/train/data.csv'
Describe the bug
I'm trying to run a training job on SageMaker using the newly added TabTransformer/AutoGluon-Tabular built-in algorithms. When training starts, it raises the exception "FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/train/data.csv'". No matter how I pass my CSV file, the training script expects a hardcoded data.csv file. It doesn't work even if I rename my training set to "data.csv".
To reproduce
from sagemaker import image_uris, model_uris, script_uris
train_model_id, train_model_version, train_scope = "autogluon-classification-ensemble", "*", "training"
training_instance_type = "ml.c4.2xlarge"
# Retrieve the docker image
train_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    model_id=train_model_id,
    model_version=train_model_version,
    image_scope=train_scope,
    instance_type=training_instance_type,
)
# Retrieve the training script
train_source_uri = script_uris.retrieve(
    model_id=train_model_id, model_version=train_model_version, script_scope=train_scope
)
train_model_uri = model_uris.retrieve(
    model_id=train_model_id, model_version=train_model_version, model_scope=train_scope
)
training_dataset_s3_path = "s3://bucket/key/mytrain.csv"
s3_output_location = "s3://bucket/key2/"
from sagemaker import hyperparameters
# Retrieve the default hyper-parameters for training the model
hyperparameters = hyperparameters.retrieve_default(
    model_id=train_model_id, model_version=train_model_version
)
# [Optional] Override default hyperparameters with custom values
hyperparameters["auto_stack"] = "True"
print(hyperparameters)
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base
training_job_name = "auto-gluon5"
# Create SageMaker Estimator instance
tabular_estimator = Estimator(
    role="myrole",
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=s3_output_location,
)
# Launch a SageMaker Training job by passing the S3 path of the training data
tabular_estimator.fit(
    {"training": training_dataset_s3_path}, logs=True, job_name=training_job_name
)
Logs
I'm attaching my job logs: log-events-viewer-result(1).csv
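Possible cause (unverified)
The path in the error message suggests the script looks for train/data.csv inside the "training" channel directory, not for a file directly at the channel root. That would explain why renaming the file to data.csv alone does not help. Below is a minimal sketch of how the expected S3 layout could be derived from that path. It assumes SageMaker File mode copies objects under the channel's S3 prefix into /opt/ml/input/data/<channel>/ and keeps their relative keys. required_s3_uri is a hypothetical helper, not part of the SageMaker SDK, and I have not confirmed this layout against the algorithm's documentation.

```python
import posixpath

# Path copied verbatim from the FileNotFoundError in this report.
EXPECTED_PATH = "/opt/ml/input/data/training/train/data.csv"

# Assumption: each File-mode input channel is mounted at this root plus the channel name.
CHANNEL_ROOT = "/opt/ml/input/data"


def required_s3_uri(channel_s3_prefix, channel="training", expected_path=EXPECTED_PATH):
    """Return the S3 URI an object would need so that it lands at expected_path.

    Hypothetical helper for illustration only.
    """
    channel_dir = posixpath.join(CHANNEL_ROOT, channel)
    # Part of the expected path that lies below the channel directory.
    relative_key = posixpath.relpath(expected_path, channel_dir)
    return channel_s3_prefix.rstrip("/") + "/" + relative_key


print(required_s3_uri("s3://bucket/key/"))  # s3://bucket/key/train/data.csv
```

If that assumption holds, uploading the CSV to s3://bucket/key/train/data.csv and passing {"training": "s3://bucket/key/"} to fit() (the directory prefix rather than the file itself) might avoid the error.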