sagemaker-python-sdk

metadata storage available to all pipeline steps (read+write)

Open jonathanhillwebsite opened this issue 2 years ago • 1 comments

Describe the feature you'd like

Each step in the Pipeline produces some sort of metadata, and this metadata may be needed in a later step. It would be a lot easier if a JSON file could be passed to each sequential step in the pipeline to store metadata. For example, the number of samples (per class) in the processed training data would later be required to calculate number_of_steps in a TensorFlow model, and it can also affect decisions made in future steps (e.g. a conditional statement that says if eval_accuracy > n && samples_per_class > s).
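To make the request concrete, here is a minimal sketch of how a later step might consume such metadata. It assumes the metadata arrives as a JSON document with the hypothetical keys `samples_per_class` and `eval_accuracy`; the function names, key names, and thresholds are illustrative assumptions and not part of the SageMaker Python SDK.

```python
import json
import math


def steps_per_epoch(samples_per_class: dict, batch_size: int) -> int:
    """Number of training steps needed to see every sample once."""
    total_samples = sum(samples_per_class.values())
    return math.ceil(total_samples / batch_size)


def should_continue(metadata: dict, min_accuracy: float, min_samples: int) -> bool:
    """Mirror of the condition `eval_accuracy > n && samples_per_class > s`."""
    enough_samples = all(
        count > min_samples for count in metadata["samples_per_class"].values()
    )
    return metadata["eval_accuracy"] > min_accuracy and enough_samples


# Hypothetical metadata, as earlier steps might have written it.
metadata = json.loads(
    '{"samples_per_class": {"cat": 1200, "dog": 950}, "eval_accuracy": 0.91}'
)

print(steps_per_epoch(metadata["samples_per_class"], batch_size=32))
print(should_continue(metadata, min_accuracy=0.9, min_samples=500))
```

Both values depend on numbers produced by different earlier steps, which is why a single shared document would be convenient here.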

How would this feature be used? Please describe.

Much like how data is passed through AWS Step Functions, each sequential step can have access to metadata from past steps.

Describe alternatives you've considered

I've created a file called metadata.json that I append data to in the source code of each step and then push to S3. However, this could cause issues with eventually consistent S3 updates.
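A minimal sketch of this workaround, assuming each step sees metadata.json as a local file (for example, one downloaded from S3 before the step runs and uploaded again afterwards; the S3 transfer itself is omitted). The helper name `update_metadata` and the per-step namespacing are assumptions made for this example.

```python
import json
import os
import tempfile


def update_metadata(path: str, step_name: str, values: dict) -> dict:
    """Merge `values` under `step_name` in the JSON file at `path`."""
    metadata = {}
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            metadata = json.load(f)

    # Namespace by step so steps cannot overwrite each other's keys.
    metadata.setdefault(step_name, {}).update(values)

    # Write to a temporary file and rename it into place, so a reader
    # never observes a half-written metadata.json.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
    os.replace(tmp_path, path)
    return metadata


# Usage: two hypothetical steps recording their results in turn.
with tempfile.TemporaryDirectory() as workdir:
    metadata_path = os.path.join(workdir, "metadata.json")
    update_metadata(metadata_path, "preprocess", {"samples_per_class": {"cat": 1200}})
    print(update_metadata(metadata_path, "evaluate", {"eval_accuracy": 0.91}))
```

This is still a read-modify-write cycle, so it only behaves predictably when steps run one after another; steps running in parallel could overwrite each other's updates.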


jonathanhillwebsite • Jun 13 '22 13:06

Hi @jonathanhillwebsite,

Thanks for the feedback. As you've noted, properties currently come from step outputs and can be referenced as inputs to another step, but there's something to be said for the convenience of a pipeline-level property bag like the one you've made with S3.

I've tagged this as a feature request and will keep it open for tracking. I can't commit to the feature or a timeline for it right now, but feel free to share any other thoughts on your use-case or how you'd envision this interface.

staubhp • Jun 24 '22 22:06

Hi @jonathanhillwebsite,

Thanks for using SageMaker and taking the time to suggest ways to improve the SageMaker Python SDK. We have added your feature request to our backlog and may consider it for future SDK versions. I will go ahead and close the issue now. Please let me know if you have any more feedback or other questions.

Best, Shweta

ShwetaSingh801 • Dec 22 '23 08:12