amazon-sagemaker-examples

[Bug Report] TabTransformer/Gluon auto SM built-in FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/train/data.csv'

Open Rhuax opened this issue 1 year ago • 0 comments

Describe the bug
I'm trying to run a training job on SageMaker using the newly added TabTransformer/AutoGluon-Tabular built-in algorithms. When training starts, it raises "FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/train/data.csv'". No matter how I pass my CSV file, the training script expects a hardcoded data.csv file. It doesn't work even if I rename my training set to "data.csv".

To reproduce

from sagemaker import image_uris, model_uris, script_uris

train_model_id, train_model_version, train_scope = "autogluon-classification-ensemble", "*", "training"
training_instance_type = "ml.c4.2xlarge"

# Retrieve the docker image
train_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    model_id=train_model_id,
    model_version=train_model_version,
    image_scope=train_scope,
    instance_type=training_instance_type
)

# Retrieve the training script
train_source_uri = script_uris.retrieve(
    model_id=train_model_id, model_version=train_model_version, script_scope=train_scope
)

# Retrieve the pre-trained model artifact used as the starting point for training
train_model_uri = model_uris.retrieve(
    model_id=train_model_id, model_version=train_model_version, model_scope=train_scope
)

training_dataset_s3_path = "s3://bucket/key/mytrain.csv"

s3_output_location = "s3://bucket/key2/"

from sagemaker import hyperparameters

# Retrieve the default hyperparameters for training the model
hyperparameters = hyperparameters.retrieve_default(
    model_id=train_model_id, model_version=train_model_version
)

# [Optional] Override default hyperparameters with custom values
hyperparameters["auto_stack"] = "True"
print(hyperparameters)

from sagemaker.estimator import Estimator

training_job_name = "auto-gluon5"

# Create SageMaker Estimator instance
tabular_estimator = Estimator(
    role="myrole",
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=s3_output_location
)

# Launch a SageMaker Training job by passing the S3 path of the training data
tabular_estimator.fit(
    {"training": training_dataset_s3_path}, logs=True, job_name=training_job_name
)
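
The path in the error message suggests that the training script looks for train/data.csv inside the "training" channel directory, while the call above points the channel at a single object, mytrain.csv. The following is a minimal sketch of that mismatch, not the algorithm's actual code. It assumes File input mode with an S3 prefix data source, where the part of the object key after the prefix is kept as a relative path under /opt/ml/input/data/<channel>. The helper `container_path` and the alternative S3 layout are hypothetical and only for illustration.

```python
import posixpath
from urllib.parse import urlparse

# Assumption: File input mode places each channel under this directory.
INPUT_ROOT = "/opt/ml/input/data"


def container_path(channel, s3_prefix, s3_object_uri):
    """Estimate where an S3 object lands inside the training container.

    Hypothetical helper: assumes the key portion after the channel prefix
    is preserved as a relative path under the channel directory.
    """
    prefix_key = urlparse(s3_prefix).path.lstrip("/")
    object_key = urlparse(s3_object_uri).path.lstrip("/")
    if not object_key.startswith(prefix_key):
        raise ValueError("object is not under the channel prefix")
    if object_key == prefix_key:
        # The prefix names the object itself, so only the file name remains.
        relative = posixpath.basename(object_key)
    else:
        relative = object_key[len(prefix_key):].lstrip("/")
    return posixpath.join(INPUT_ROOT, channel, relative)


# Path taken from the FileNotFoundError in this report.
EXPECTED = "/opt/ml/input/data/training/train/data.csv"

# Input as passed in the report: the channel points at a single CSV file.
as_reported = container_path(
    "training", "s3://bucket/key/mytrain.csv", "s3://bucket/key/mytrain.csv"
)

# Hypothetical alternative: the channel points at a prefix that
# contains train/data.csv.
alternative = container_path(
    "training", "s3://bucket/key/", "s3://bucket/key/train/data.csv"
)

print(as_reported, as_reported == EXPECTED)
print(alternative, alternative == EXPECTED)
```

Under these assumptions, only the second layout produces the path named in the error. That would also explain why renaming the file to data.csv without a train/ subdirectory does not help. Whether the algorithm accepts such a layout is not confirmed here.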

Logs
I'm attaching my job logs: log-events-viewer-result(1).csv

Rhuax, Jul 27 '22 12:07