amazon-sagemaker-examples
Sagemaker Processors base_job_name argument not working
Describe the bug
Even though the base_job_name argument is set in the Processor definition (for instance sagemaker.sklearn.processing.SKLearnProcessor), the resulting processing job is created with a completely different name.
To reproduce
To simplify, it's possible to use the abalone pipeline example and give a custom base_job_name to the SKLearnProcessor. The result is a ProcessingJob whose name does not reflect the given base_job_name, such as pipelines-kytlemm1lvpq-PreprocessingStep-cIpzShs3Qp.
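For reference, a minimal sketch of that reproduction, adapted from the abalone example; role, sagemaker_session, and preprocessing.py are assumed to exist, and the framework version is illustrative:

    from sagemaker.sklearn.processing import SKLearnProcessor
    from sagemaker.workflow.steps import ProcessingStep

    # Sketch: a custom base_job_name that the resulting processing job never uses.
    sklearn_processor = SKLearnProcessor(
        framework_version="0.23-1",           # illustrative version
        role=role,                            # assumed to be defined
        instance_type="ml.m5.xlarge",
        instance_count=1,
        base_job_name="my-custom-prefix",     # expected, but not honored, as the job-name prefix
        sagemaker_session=sagemaker_session,  # assumed to be defined
    )

    step_process = ProcessingStep(
        name="PreprocessingStep",
        processor=sklearn_processor,
        code="preprocessing.py",
    )
    # After the pipeline runs, the job is named pipelines-<exec-id>-PreprocessingStep-<suffix>;
    # "my-custom-prefix" never appears.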
Can you please link to the notebook and point out any specific lines for the jobs/variables you're referring to?
Hi,
I have a similar issue with training and transform steps.
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep

train_step = TrainingStep(
    name='my-train',
    estimator=Estimator(
        image_uri='...',
        base_job_name='base-train',
        instance_type='ml.m5.large',
        instance_count=1,
        volume_size=1,
        max_run=200,
        output_path='...',
        subnets=[...],
        security_group_ids=[...],
        disable_profiler=True,
        sagemaker_session=sagemaker_session,
        role=role,
    ),
    inputs={
        'training': TrainingInput(s3_data='...', content_type='application/json'),
    },
)

test = Pipeline(
    name='my-pipeline',
    steps=[train_step],
    sagemaker_session=sagemaker_session,
)
test.upsert(role_arn=role)
exec = test.start(execution_display_name='my-exec')
exec.describe()
The generated name for the training job is: pipelines-iwvdptc7f9c2-my-train-hwEq0s3KdT
and I would like something like the following: base-train-my-train-hwEq0s3KdT
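One hedged workaround sketch, based only on the naming pattern observed above: since the generated name follows pipelines-<execution-id>-<step-name>-<suffix> and base_job_name is dropped, the step name is the only user-controlled part, so the desired prefix can be folded into it:

    from sagemaker.inputs import TrainingInput
    from sagemaker.workflow.steps import TrainingStep

    # Workaround sketch (assumption: base_job_name stays ignored inside pipelines):
    # fold the desired prefix into the step name, which does appear in the job name.
    train_step = TrainingStep(
        name='base-train-my-train',  # yields pipelines-<exec-id>-base-train-my-train-<suffix>
        estimator=estimator,         # an Estimator like the one defined above
        inputs={
            'training': TrainingInput(s3_data='...', content_type='application/json'),
        },
    )

This at least makes jobs from different pipelines distinguishable, even though it does not restore the base_job_name behavior.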
Voting for this to be resolved! All our training jobs have the same name, and it's impossible to tell them apart.
I'm having the same issue. I just ran an example notebook (https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-pipelines/tabular/local-mode), and even the base_job_name parameter used in it does not affect the SageMaker training job name or processing job name.
If I print out the pipeline definition, it shows the job name without a problem, something like: Job Name: sklearn-abalone-process-2022-12-02-08-59-37-161
The documentation describes the base_job_name argument as: "Prefix for processing job name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp."
For our use case, I don't want a different S3 prefix created in the bucket every time the processing job runs. Currently the code gets written to <bucket_name>/<job_name>/input/code/preprocessing.py; I'm interested in a format like <bucket_name>/processing_jobs/<job_name>/input/code/ instead. I thought I could achieve this by passing the base_job_name argument, but it doesn't seem to have any effect.
import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.sklearn.processing import SKLearnProcessor

role = get_execution_role()
region = sagemaker.Session().boto_region_name
sm_client = boto3.client("sagemaker")
boto_session = boto3.Session(region_name=region)
bucket = "xxxxxxxxxxxxxxxxxx"

sagemaker_session = sagemaker.session.Session(
    boto_session=boto_session,
    sagemaker_client=sm_client,
    default_bucket=bucket,
)

base_job_prefix = "sklearn-processor"

sklearn_processor = SKLearnProcessor(
    framework_version="1.0-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    base_job_name="processing_jobs/sklearn-census-preprocess",
    sagemaker_session=sagemaker_session,
)
from sagemaker.processing import ProcessingInput, ProcessingOutput

sklearn_processor.run(
    code="preprocessing.py",
    inputs=[
        ProcessingInput(
            source=input_data,  # S3 URI of the raw dataset, defined elsewhere
            destination="/opt/ml/processing/input",
            s3_input_mode="File",
            s3_data_distribution_type="FullyReplicated",
        )
    ],
    outputs=[
        ProcessingOutput(
            output_name="train_data",
            source="/opt/ml/processing/train",
            destination="s3://xxxxxxxxxxxxxxx/datasets/census/train_data/",
        ),
        ProcessingOutput(
            output_name="test_data",
            source="/opt/ml/processing/test",
            destination="s3://xxxxxxxxxxxxxxx/datasets/census/test_data/",
        ),
    ],
    arguments=["--train-test-split-ratio", "0.2"],
)
The processing job fails with:
ClientError: An error occurred (ValidationException) when calling the CreateProcessingJob operation: 1 validation error detected: Value 'processing_jobs/sklearn-census-preproce-2023-02-16-22-05-14-498' at 'processingJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}
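That pattern is the standard SageMaker job-name constraint (letters, digits, and hyphens only, 1-63 characters), which the slash and underscore in the prefix violate. A quick sketch to sanity-check candidate names locally:

    import re

    # CreateProcessingJob name constraint from the error above.
    JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}$")

    def is_valid_job_name(name: str) -> bool:
        """Check a candidate processing-job name against the API constraint."""
        return bool(JOB_NAME_PATTERN.match(name))

    print(is_valid_job_name("processing_jobs/sklearn-census-preprocess"))  # False: '/' and '_' not allowed
    print(is_valid_job_name("sklearn-census-preprocess"))                  # True

Separately, if the goal is only the <bucket_name>/processing_jobs/<job_name>/input/code/ layout, newer releases of the SageMaker Python SDK accept a default_bucket_prefix on the Session; treat this as an assumption to verify against your installed version, not something confirmed in this thread:

    import sagemaker

    # Hedged sketch: default_bucket_prefix (newer SDK releases) is prepended to
    # auto-generated S3 keys, so uploaded code lands under
    # <bucket>/processing_jobs/<job_name>/input/code/ without touching job names.
    sagemaker_session = sagemaker.session.Session(
        boto_session=boto_session,   # reusing the session objects from above
        sagemaker_client=sm_client,
        default_bucket=bucket,
        default_bucket_prefix="processing_jobs",
    )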
Confirmed, this is happening for me as well: base_job_name is not honored when running SKLearnProcessor as part of a pipeline.
It is happening for me as well, but with the Processor class.
Hi guys! Has anyone found an explanation for the SKLearnProcessor as part of a pipeline? I'm facing two problems:
- I'm getting a warning that the ProcessingJobName field is popped out of the pipeline definition by default, since it will be overridden at pipeline execution time. The warning recommends using PipelineDefinitionConfig to persist the field in the pipeline definition if desired. I couldn't figure out how to set that up in order to test it; does anyone know? (A sketch is at the end of this message.)
- I think the warning above makes the pipeline crash at some point, and I'm trying to figure out where. The main log line says the object has no '_current_job_name' attribute. Below is the entire log:
Has anyone had this problem? Any tips would be appreciated; I'm still learning about this and I'm lost. Thanks in advance for the help.
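For the PipelineDefinitionConfig question above, a minimal sketch, assuming an SDK version recent enough to ship it; use_custom_job_prefix=True keeps the *JobName fields (built from each step's base_job_name) in the pipeline definition instead of popping them out, and the pipeline and step names here are illustrative:

    from sagemaker.workflow.pipeline import Pipeline
    from sagemaker.workflow.pipeline_definition_config import PipelineDefinitionConfig

    # Hedged sketch: persist custom job-name prefixes in the pipeline definition.
    definition_config = PipelineDefinitionConfig(use_custom_job_prefix=True)

    pipeline = Pipeline(
        name="my-pipeline",                    # illustrative name
        steps=[step_process],                  # steps assumed to be defined elsewhere
        pipeline_definition_config=definition_config,
        sagemaker_session=sagemaker_session,   # assumed to be defined
    )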