sagemaker-python-sdk
Add ContainerArguments to sagemaker.estimator.Estimator
Describe the feature you'd like
Allow arguments to be passed to ContainerEntrypoint for sagemaker.estimator.Estimator as ContainerArguments, similar to ScriptProcessor.
ContainerEntrypoint should be flexible enough to work with different argument parsers, such as argparse or Hydra (https://github.com/facebookresearch/hydra). I guess the simplest approach would be to pass the arguments as a single string and concatenate it with the train or serve ContainerEntrypoint.
How would this feature be used?
estimator = sagemaker.estimator.Estimator(
    image,
    role,
    1,
    'ml.c4.2xlarge',
    output_path="s3://{}/output".format(sess.default_bucket()),
    sagemaker_session=sess,
    container_entrypoint={
        "train": "+model_name=my_model --my_dataset=my_dataset ",
        "serve": "+model_weight_selection=miou",
    },
)
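One way to read the single-string proposal is to split the string with shell-style rules and append the resulting tokens to the base entrypoint. The following is only a minimal sketch of that idea: `build_entrypoint` is a hypothetical helper, not part of the SageMaker SDK, and it uses nothing beyond the standard library.

```python
import shlex


def build_entrypoint(mode, extra_args=""):
    """Hypothetical helper: return the entrypoint as a list of tokens.

    `mode` is the base entrypoint ("train" or "serve"); `extra_args` is the
    single argument string from the proposal above.
    """
    if mode not in ("train", "serve"):
        raise ValueError("mode must be 'train' or 'serve', got {!r}".format(mode))
    # shlex.split honours quoting, so values containing spaces can be quoted.
    return [mode] + shlex.split(extra_args)


train_entrypoint = build_entrypoint(
    "train", "+model_name=my_model --my_dataset=my_dataset "
)
print(train_entrypoint)
```

Because the tokens are passed through unchanged, both argparse-style flags (`--my_dataset=...`) and Hydra-style overrides (`+model_name=...`) survive as-is.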
Describe alternatives you've considered
Currently, I use sed to override the default missing arguments in the Hydra conf YAML file when building the Docker image.
Is there an update regarding this functionality? I also need this feature
Why has there been no response? This feature should be provided.
Is there an update regarding this functionality? I also need this feature.
Hello @zhazhn, did you find a workaround? The SageMaker team does not seem to care about the community.
Hello @tchaton, did you find a workaround? The SageMaker team does not seem to care about the community's needs.
I have a related issue in #4197.
@metrizable, any update on this?
@martinRenou, any update on this?
We have added your feature request to our backlog and may consider including it in a future SDK version. I will go ahead and close the issue now; please let me know if you have any more feedback.
What a bummer. I was planning to start using Hydra to manage the config files. There are just so many irritating things about SageMaker that I should just look for an alternative.
Workaround I have been using, if you are using SageMaker Pipelines:
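- Create the pipeline, then convert its definition to JSON:
import json

from sagemaker.workflow.pipeline import Pipeline

pipeline = Pipeline(
    sagemaker_session=pipeline_session,
    name=pipeline_name,
    parameters=parameters.sagemaker_params,
    steps=step_list,
)
# pipeline.definition() returns a JSON string; parse it into a dict so it can be edited
pipeline_json = json.loads(pipeline.definition())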
- At this point you can modify it according to the CreateTrainingJob API. Specifically, set the ContainerEntrypoint:
for step in pipeline_json["Steps"]:
    name = step["Name"]
    if name == "TrainingStepThatNeedsContainerArgs":
        step["Arguments"]["AlgorithmSpecification"]["ContainerEntrypoint"] = ["train", "--arg1", "val1", "--arg2", "val2"]
- Create the pipeline using boto:
import json
import uuid

import botocore.exceptions

client = boto3_session.client("sagemaker")
try:
    # Update the pipeline if it already exists...
    client.update_pipeline(
        PipelineName=...,
        PipelineDefinition=json.dumps(pipeline_json),  # the modified dict, not the Pipeline object
        PipelineDescription=...,
        RoleArn=...,
    )
except botocore.exceptions.ClientError:
    # ...otherwise create it. Note this fallback triggers on any ClientError.
    client.create_pipeline(
        PipelineName=...,
        PipelineDefinition=json.dumps(pipeline_json),
        PipelineDescription=...,
        RoleArn=...,
    )
client.start_pipeline_execution(
    PipelineName=...,
    PipelineExecutionDisplayName=...,
    ClientRequestToken=str(uuid.uuid4()),
)
I basically do this to turn SageMaker Pipelines into a custom Docker job orchestrator with additional features.
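The patching step in this workaround can be wrapped in a small function so that a misspelled step name fails loudly instead of silently leaving the definition unchanged. This is a rough sketch only: `set_container_entrypoint` is a hypothetical helper name, and the toy definition below contains just the keys used in the loop above, not a full pipeline definition.

```python
import copy
import json


def set_container_entrypoint(definition, step_name, entrypoint):
    """Return a copy of `definition` with ContainerEntrypoint set on one step.

    `definition` is the dict obtained from json.loads(pipeline.definition()).
    Raises KeyError if no step named `step_name` exists.
    """
    patched = copy.deepcopy(definition)  # leave the caller's dict untouched
    for step in patched.get("Steps", []):
        if step.get("Name") == step_name:
            spec = step["Arguments"]["AlgorithmSpecification"]
            spec["ContainerEntrypoint"] = list(entrypoint)
            return patched
    raise KeyError("no step named {!r} in pipeline definition".format(step_name))


# Toy definition with only the fields this sketch touches (an assumption).
toy_definition = {
    "Steps": [
        {
            "Name": "TrainingStepThatNeedsContainerArgs",
            "Arguments": {"AlgorithmSpecification": {"TrainingImage": "my-image"}},
        }
    ]
}

patched = set_container_entrypoint(
    toy_definition,
    "TrainingStepThatNeedsContainerArgs",
    ["train", "--arg1", "val1", "--arg2", "val2"],
)
print(json.dumps(patched, indent=2))
```

The result of `json.dumps(patched)` is what would then be passed as `PipelineDefinition`.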
I wanted to pass arguments to the Estimator. The way I found is:
estimator = Estimator(
    ...,
    container_entry_point=["/usr/bin/python3"],
    container_arguments=["/opt/ml/code/training.py", "--instances", "test,test"],
)
Maybe it helps somebody.
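On the receiving side, the script just sees those tokens as ordinary command-line arguments. The contents of `training.py` are not shown in this thread, so the following is only a guess at what a minimal parser for `--instances test,test` might look like; the `parse_args` function and the comma-splitting behaviour are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments forwarded via container_arguments.

    Hypothetical: assumes --instances is a comma-separated list.
    """
    parser = argparse.ArgumentParser(description="training entry script (sketch)")
    parser.add_argument(
        "--instances",
        type=lambda value: value.split(","),  # "test,test" -> ["test", "test"]
        default=[],
        help="comma-separated list of instance names",
    )
    return parser.parse_args(argv)


# Simulate the tokens that follow the script path in container_arguments.
args = parse_args(["--instances", "test,test"])
print(args.instances)
```

In the real script, calling `parse_args()` with no argument would read from `sys.argv` instead.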