sagemaker-python-sdk Add ContainerArguments to sagemaker.estimator.Estimator

Open tchaton opened this issue 3 years ago • 1 comments

Describe the feature you'd like

Allow passing arguments to ContainerEntrypoint for sagemaker.estimator.Estimator as ContainerArguments, similar to ScriptProcessor.

ContainerEntrypoint should be flexible enough to work with different argument parsers, such as argparse or Hydra (https://github.com/facebookresearch/hydra). I guess the simplest approach would be to pass the arguments as a single string and concatenate it with the train or serve ContainerEntrypoint.
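
The concatenation idea can be sketched in plain Python. This is only an illustration of the proposal, not SDK behavior: `build_command` is a hypothetical helper, and splitting the string with `shlex` is an assumption about how a single argument string might be turned into a command list.

```python
import shlex


def build_command(mode, extra_args=""):
    """Hypothetical helper: append a user-supplied argument string to the
    container's `train` or `serve` command."""
    if mode not in ("train", "serve"):
        raise ValueError(f"unsupported mode: {mode!r}")
    # shlex.split follows shell-style quoting and drops surrounding whitespace.
    return [mode] + shlex.split(extra_args)


train_cmd = build_command("train", "+model_name=my_model --my_dataset=my_dataset ")
serve_cmd = build_command("serve", "+model_weight_selection=miou")
```

Because the tokens are passed through unchanged, both Hydra-style overrides (`+key=value`) and argparse-style flags (`--key=value`) would reach the parser inside the container as written.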

How would this feature be used? Please describe.

estimator = sagemaker.estimator.Estimator(
    image,
    role,
    1,
    'ml.c4.2xlarge',
    output_path="s3://{}/output".format(sess.default_bucket()),
    sagemaker_session=sess,
    # Proposed parameter: extra arguments appended to the train / serve entrypoints.
    container_entrypoint={
        "train": "+model_name=my_model --my_dataset=my_dataset ",
        "serve": "+model_weight_selection=miou",
    },
)

Describe alternatives you've considered

Currently, I am using sed to override missing default arguments in the Hydra conf YAML file when building the Docker image.

Aug 16 '20 12:08 tchaton

Is there an update regarding this functionality? I also need this feature

Sep 05 '22 08:09 dudulasry

Why is there no response? This feature should be provided.

Dec 10 '22 13:12 zhazhn

Is there an update regarding this functionality? I also need this feature.

Hello @zhazhn, did you find a workaround? The SageMaker team does not seem to care about the community.

Oct 16 '23 15:10 celsofranssa

Hello @tchaton, did you find a workaround? The SageMaker team does not seem to care about the community's needs.

Oct 16 '23 15:10 celsofranssa

I have a related issue in #4197.

Oct 16 '23 15:10 celsofranssa

@metrizable, any update on this?

Oct 22 '23 18:10 celsofranssa

@martinRenou, any update on this?

Oct 22 '23 18:10 celsofranssa

We have added your feature request to our backlog and may consider it for future SDK versions. I will go ahead and close the issue now; please let me know if you have any more feedback.

Dec 22 '23 17:12 akrishna1995

What a bummer. I was planning to start using Hydra to manage the config files. There are just so many irritating things about SageMaker that I should look for an alternative.

Jan 26 '24 18:01 ouj

Workaround I have been using, if you are using SageMaker Pipelines:

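Create the pipeline, then convert its definition to JSON:

import json

from sagemaker.workflow.pipeline import Pipeline

pipeline = Pipeline(
    sagemaker_session=pipeline_session,
    name=pipeline_name,
    parameters=parameters.sagemaker_params,
    steps=step_list,
)

# pipeline.definition() returns a JSON string; parse it into a dict for editing.
pipeline_json = json.loads(pipeline.definition())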

At this point you can modify the definition to match the CreateTrainingJob API. Specifically, set ContainerEntrypoint under AlgorithmSpecification for the training step:

for step in pipeline_json["Steps"]:
    name = step["Name"]
    if name == "TrainingStepThatNeedsContainerArgs":
        # The entrypoint list carries both the command and its arguments.
        step["Arguments"]["AlgorithmSpecification"]["ContainerEntrypoint"] = ["train", "--arg1", "val1", "--arg2", "val2"]
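
The patching step can also be wrapped in a small function and checked against a hand-written definition before touching a real pipeline. This is a minimal sketch: `set_container_entrypoint` and `sample_definition` are hypothetical names, and the sample dict only mimics the keys used above rather than a complete pipeline definition.

```python
import copy


def set_container_entrypoint(definition, step_name, entrypoint):
    """Return a copy of `definition` with ContainerEntrypoint set on the
    step called `step_name` (hypothetical helper, not part of the SDK)."""
    patched = copy.deepcopy(definition)  # leave the caller's dict untouched
    matched = False
    for step in patched.get("Steps", []):
        if step.get("Name") == step_name:
            spec = step["Arguments"]["AlgorithmSpecification"]
            spec["ContainerEntrypoint"] = list(entrypoint)
            matched = True
    if not matched:
        raise KeyError(f"no step named {step_name!r} in pipeline definition")
    return patched


# Hand-written stand-in for json.loads(pipeline.definition()).
sample_definition = {
    "Steps": [
        {
            "Name": "TrainingStepThatNeedsContainerArgs",
            "Arguments": {"AlgorithmSpecification": {"TrainingImage": "my-image"}},
        }
    ]
}

patched = set_container_entrypoint(
    sample_definition,
    "TrainingStepThatNeedsContainerArgs",
    ["train", "--arg1", "val1", "--arg2", "val2"],
)
```

Raising on a missing step name avoids silently uploading an unmodified definition when the step is renamed.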

Create (or update) the pipeline using boto3, then start an execution:

import uuid

import botocore.exceptions

client = boto3_session.client("sagemaker")
try:
    # Upload the modified definition (pipeline_json), not the Pipeline object.
    client.update_pipeline(
        PipelineName=...,
        PipelineDefinition=json.dumps(pipeline_json),
        PipelineDescription=...,
        RoleArn=...,
    )
except botocore.exceptions.ClientError:
    # The update failed (for example, the pipeline does not exist yet), so create it.
    client.create_pipeline(
        PipelineName=...,
        PipelineDefinition=json.dumps(pipeline_json),
        PipelineDescription=...,
        RoleArn=...,
    )
client.start_pipeline_execution(
    PipelineName=...,
    PipelineExecutionDisplayName=...,
    ClientRequestToken=str(uuid.uuid4()),
)

I basically do this to make SageMaker Pipelines a custom docker job orchestrator with additional features.

Feb 08 '24 23:02 matthost

I wanted to pass arguments into the Estimator. The way I found is:

estimator = Estimator(...
                      container_entry_point=["/usr/bin/python3"],
                      container_arguments=["/opt/ml/code/training.py", "--instances", "test,test"])

Maybe it helps somebody.
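
On the container side, a script invoked this way might read the arguments with argparse. The following is a sketch under assumptions, not the contents of the training.py mentioned above: `parse_args` is a hypothetical function, and splitting `--instances` on commas is a guess at how the value is meant to be used.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical parser for a training script started via container_arguments."""
    parser = argparse.ArgumentParser(description="hypothetical training entry script")
    # Assumption: --instances carries a comma-separated list of names.
    parser.add_argument("--instances", type=lambda s: s.split(","), default=[])
    return parser.parse_args(argv)


# Same arguments as in the Estimator call above, minus the script path.
args = parse_args(["--instances", "test,test"])
```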

Jul 03 '24 09:07 tikr7
