[aws-stepfunctions-tasks] Add support for SageMaker Processing

Open tuliocasagrande opened this issue 4 years ago • 8 comments

SageMaker and Step Functions released a new integration to create processing jobs directly in the state machine:

  • https://docs.aws.amazon.com/step-functions/latest/dg/connect-sagemaker.html#sagemaker-example-processing
  • https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html

Use Case

SageMaker Processing is a flexible API that can run preprocessing and post-processing jobs, which makes it possible to fully automate an ML use case with Step Functions.

Proposed Solution

Use https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-stepfunctions-tasks/lib/sagemaker/create-training-job.ts as a starting point.

Other

  • [ ] :wave: I may be able to implement this feature request

This is a :rocket: Feature Request

tuliocasagrande avatar Aug 08 '20 12:08 tuliocasagrande

To make sure that effort isn't duplicated:

  • I have implemented SageMaker's CreateProcessingJob locally
  • I expect to have time to clean up my implementation & submit a pull request for this issue (#9537) by 10-19-2020.

heatsink avatar Oct 12 '20 19:10 heatsink

Hello, any updates here?

ihorfito avatar Feb 24 '21 14:02 ihorfito

Hi! Any updates here? :)

AustinGomez avatar Oct 01 '21 19:10 AustinGomez

The furthest-along PR that would close this issue is #14633, but I am of the opinion that a design overhaul is needed. @kaizen3031593 can take another look to see if he agrees or if the PR can be progressed in its current state.

BenChaimberg avatar Oct 02 '21 00:10 BenChaimberg

@kaizen3031593 is there any further update on this PR, please? I have a customer who needs to use this functionality in CDK.

spssmn-aws avatar Jan 14 '22 15:01 spssmn-aws

@spssmn-aws this is not in my immediate roadmap, but I would be happy to field community contributions on this. It looks like the PR #14633 is pretty stale at this point and didn't really get super far along in the API design process. Since this is a fairly complicated ask, the first step would be to iterate over the API design a few times before we get into the weeds of the implementation.

For anyone who needs this task (or any other task that doesn't have native stepfunctions-task support), you can create a custom state to do what you need to do.
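A minimal sketch of what that custom-state workaround could look like for CreateProcessingJob. The helper `processingJobStateJson`, its option names, the defaults, and the placeholder image URI and role ARN are all hypothetical and not part of aws-cdk; only the required CreateProcessingJob fields are filled in, so treat it as a starting point rather than a complete task.

```typescript
// Hypothetical helper (not part of aws-cdk): builds the Amazon States Language
// definition for a SageMaker CreateProcessingJob task state.
interface ProcessingJobStateOptions {
  jobNamePath: string;      // JSONPath into the state input, e.g. "$.JobName"
  imageUri: string;         // processing container image
  roleArn: string;          // execution role assumed by SageMaker
  instanceType: string;     // e.g. "ml.m5.xlarge"
  instanceCount?: number;   // assumed default: 1
  volumeSizeInGB?: number;  // assumed default: 30
}

function processingJobStateJson(opts: ProcessingJobStateOptions): Record<string, unknown> {
  return {
    Type: "Task",
    // ".sync" makes Step Functions wait for the processing job to finish.
    Resource: "arn:aws:states:::sagemaker:createProcessingJob.sync",
    Parameters: {
      // The ".$" suffix tells Step Functions to resolve the value from the state input.
      "ProcessingJobName.$": opts.jobNamePath,
      AppSpecification: { ImageUri: opts.imageUri },
      ProcessingResources: {
        ClusterConfig: {
          InstanceCount: opts.instanceCount ?? 1,
          InstanceType: opts.instanceType,
          VolumeSizeInGB: opts.volumeSizeInGB ?? 30,
        },
      },
      RoleArn: opts.roleArn,
    },
  };
}

// Placeholder values for illustration only.
const stateJson = processingJobStateJson({
  jobNamePath: "$.JobName",
  imageUri: "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-processing-image:latest",
  roleArn: "arn:aws:iam::123456789012:role/MySageMakerRole",
  instanceType: "ml.m5.xlarge",
});

// Inside a CDK stack, with `import * as sfn from "aws-cdk-lib/aws-stepfunctions"`:
//   new sfn.CustomState(this, "ProcessingJob", { stateJson });
```

As far as I can tell, a custom state does not add any IAM permissions for you, so the state machine's role would likely need the SageMaker permissions for this call granted separately.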

As always, +1s can help me change my mind :).

kaizencc avatar Jan 14 '22 15:01 kaizencc

@kaizencc is there a reason why create-training-job extends sfn.TaskStateBase instead of using the custom-state approach?

I need to implement the CreateProcessingJob API for my team's requirements. Was wondering if there is a reason why one should prefer using sfn.TaskStateBase over customState?

ighosh98 avatar Sep 21 '22 10:09 ighosh98

@ighosh98 you can think of a custom state as a lower-level API. If you just want to supply the properties in pseudo-JSON, a custom state works well for you. If you want to create a new Step Functions task, then you extend sfn.TaskStateBase and build off of that.

kaizencc avatar Sep 21 '22 14:09 kaizencc

Bumping - any updates? Our team would benefit from this too.

athewsey avatar May 30 '23 07:05 athewsey

This issue has received a significant amount of attention so we are automatically upgrading its priority. A member of the community will see the re-prioritization and provide an update on the issue.

github-actions[bot] avatar Jun 16 '24 00:06 github-actions[bot]