sagemaker-python-sdk

Add ContainerArguments to sagemaker.estimator.Estimator

Open tchaton opened this issue 3 years ago • 1 comments

Describe the feature you'd like

Allow passing arguments to the ContainerEntrypoint of sagemaker.estimator.Estimator as ContainerArguments, similar to ScriptProcessor.

ContainerEntrypoint should be flexible enough to work with different argument parsers, such as argparse or Hydra (https://github.com/facebookresearch/hydra). I guess the simplest approach would be to pass the arguments as a single string and concatenate it with the train or serve ContainerEntrypoint.

How would this feature be used? Please describe.

estimator = sagemaker.estimator.Estimator(
    image,
    role,
    1,
    'ml.c4.2xlarge',
    output_path="s3://{}/output".format(sess.default_bucket()),
    sagemaker_session=sess,
    container_entrypoint={
        "train": "+model_name=my_model --my_dataset=my_dataset ",
        "serve": "+model_weight_selection=miou",
    },
)
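
As a rough illustration of the "single string, concatenated with train or serve" idea, the sketch below splits each argument string and appends it to the base command. `build_entrypoint` is a hypothetical helper, not part of the SageMaker SDK, and the sketch assumes the base command is just the mode name.

```python
import shlex


def build_entrypoint(mode, extra_args):
    """Hypothetical helper: append one argument string to the base command.

    `mode` is "train" or "serve"; `extra_args` maps each mode to a single
    string, as in the proposed `container_entrypoint` dict.
    """
    # shlex.split honours shell-style quoting, so quoted values with spaces survive.
    return [mode] + shlex.split(extra_args.get(mode, ""))


extra_args = {
    "train": "+model_name=my_model --my_dataset=my_dataset ",
    "serve": "+model_weight_selection=miou",
}
train_cmd = build_entrypoint("train", extra_args)
serve_cmd = build_entrypoint("serve", extra_args)
```

Keeping the result as a list of tokens rather than one string leaves the parsing to whatever the container uses (argparse, Hydra, and so on).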

Describe alternatives you've considered

Currently, I am using sed to override the missing default arguments in the Hydra conf YAML file when building the Docker image.

tchaton avatar Aug 16 '20 12:08 tchaton

Is there an update regarding this functionality? I also need this feature.

dudulasry avatar Sep 05 '22 08:09 dudulasry

Why is there no response? This feature should be provided.

zhazhn avatar Dec 10 '22 13:12 zhazhn

Hello @zhazhn, did you find a workaround? The SageMaker team does not seem to care about the community.

celsofranssa avatar Oct 16 '23 15:10 celsofranssa

Hello @tchaton, did you find a workaround? The SageMaker team does not seem to care about the community's needs.

celsofranssa avatar Oct 16 '23 15:10 celsofranssa

I have a related issue in #4197.

celsofranssa avatar Oct 16 '23 15:10 celsofranssa

@metrizable, any update on this?

celsofranssa avatar Oct 22 '23 18:10 celsofranssa

@martinRenou, any update on this?

celsofranssa avatar Oct 22 '23 18:10 celsofranssa

We have added your feature request to our backlog and may consider including it in future SDK versions. I will go ahead and close the issue now; please let me know if you have any more feedback.

akrishna1995 avatar Dec 22 '23 17:12 akrishna1995

What a bummer. I was planning to start using Hydra to manage the config files. There are just so many irritating things about SageMaker; I should just look for an alternative.

ouj avatar Jan 26 '24 18:01 ouj

Workaround I have been using, if you are using SageMaker Pipelines:

  1. Create the pipeline, then convert its definition to JSON:
import json
import uuid

import botocore.exceptions
from sagemaker.workflow.pipeline import Pipeline

pipeline = Pipeline(
    sagemaker_session=pipeline_session,
    name=pipeline_name,
    parameters=parameters.sagemaker_params,
    steps=step_list,
)

pipeline_json = json.loads(pipeline.definition())
  2. At this point you can modify it according to the CreateTrainingJob API. Specifically, set the ContainerEntrypoint:
for step in pipeline_json["Steps"]:
    name = step["Name"]
    if name == "TrainingStepThatNeedsContainerArgs":
        step["Arguments"]["AlgorithmSpecification"]["ContainerEntrypoint"] = ["train", "--arg1", "val1", "--arg2", "val2"]
  3. Create (or update) the pipeline using boto, then start an execution:
# boto3_session is assumed to be an existing boto3.Session
client = boto3_session.client("sagemaker")
try:
    client.update_pipeline(
        PipelineName=...,
        # Serialize the modified dict, not the Pipeline object
        PipelineDefinition=json.dumps(pipeline_json),
        PipelineDescription=...,
        RoleArn=...,
    )
except botocore.exceptions.ClientError:
    # Assumes the update failed because the pipeline does not exist yet
    client.create_pipeline(
        PipelineName=...,
        PipelineDefinition=json.dumps(pipeline_json),
        PipelineDescription=...,
        RoleArn=...,
    )
client.start_pipeline_execution(
    PipelineName=...,
    PipelineExecutionDisplayName=...,
    ClientRequestToken=str(uuid.uuid4()),
)

I basically do this to use SageMaker Pipelines as a custom Docker job orchestrator with additional features.
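
The patching step of this workaround can be isolated into a small function that works on the parsed definition dict, which makes it easy to check without calling AWS. This is only a sketch: `set_container_entrypoint` is a hypothetical helper, and the `definition` dict is a tiny stand-in (with a placeholder image name) for what `json.loads(pipeline.definition())` returns.

```python
import copy


def set_container_entrypoint(definition, step_name, entrypoint):
    """Return a copy of a pipeline definition dict with one step's entrypoint set."""
    patched = copy.deepcopy(definition)  # leave the caller's dict untouched
    for step in patched["Steps"]:
        if step["Name"] == step_name:
            spec = step["Arguments"].setdefault("AlgorithmSpecification", {})
            spec["ContainerEntrypoint"] = list(entrypoint)
            return patched
    raise KeyError(f"no step named {step_name!r} in pipeline definition")


# Minimal stand-in definition; real definitions carry many more fields.
definition = {
    "Steps": [
        {"Name": "Preprocess", "Arguments": {}},
        {
            "Name": "TrainingStepThatNeedsContainerArgs",
            "Arguments": {"AlgorithmSpecification": {"TrainingImage": "example-image"}},
        },
    ]
}
patched = set_container_entrypoint(
    definition,
    "TrainingStepThatNeedsContainerArgs",
    ["train", "--arg1", "val1", "--arg2", "val2"],
)
```

Raising on an unknown step name, instead of silently doing nothing, catches a renamed step before the pipeline is uploaded.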

matthost avatar Feb 08 '24 23:02 matthost

I wanted to pass arguments into the Estimator. The way I found to do it is:

estimator = Estimator(
    ...,  # other Estimator arguments, elided here
    container_entry_point=["/usr/bin/python3"],
    container_arguments=["/opt/ml/code/training.py", "--instances", "test,test"],
)

Maybe it helps somebody.
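
On the receiving side, the script named in the container arguments sees them as ordinary command-line arguments. A minimal sketch of how such a script might parse them with argparse follows; only the `--instances` flag comes from the snippet above, while `parse_args` and the comma-splitting of the value are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments the container passes to the training script."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry script")
    # Assumption: the comma-separated value is meant to be a list of names.
    parser.add_argument("--instances", type=lambda s: s.split(","), default=[])
    return parser.parse_args(argv)


# Equivalent to running: python3 training.py --instances test,test
args = parse_args(["--instances", "test,test"])
```

Inside the container the script would call `parse_args()` with no argument, so argparse reads `sys.argv` directly.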

tikr7 avatar Jul 03 '24 09:07 tikr7