sagemaker-python-sdk

Allow storing inputs and outputs of FrameworkProcessor cross-accounts

Open • ksonbol opened this issue 3 years ago • 1 comment

Describe the feature you'd like

Definitions:

  • Account A: where I will run SageMaker FrameworkProcessor jobs.
  • Account B: where data exists that I need to access for ProcessingInput.source and ProcessingOutput.destination
  • Role A: IAM role in account A that has SageMakerFullAccess policy attached.
  • Role B: IAM role in Account B that has S3 read and write access to the needed data. Role A can assume Role B, and Role B has a trust relationship with Role A.
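
The relationship between Role A and Role B described above can be sketched as two IAM policy documents. This is only an illustration, not taken from the issue: the account IDs and role names are hypothetical placeholders, and a real setup may add conditions or further statements.

```python
import json

# Hypothetical placeholder ARNs; substitute the real account IDs and role names.
ROLE_A_ARN = "arn:aws:iam::111111111111:role/RoleA"  # lives in Account A
ROLE_B_ARN = "arn:aws:iam::222222222222:role/RoleB"  # lives in Account B


def trust_policy_for_role_b(role_a_arn: str) -> dict:
    """Trust policy attached to Role B so that Role A may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_a_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def assume_permission_for_role_a(role_b_arn: str) -> dict:
    """Identity policy attached to Role A that permits assuming Role B."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": role_b_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Print both documents so they can be pasted into the IAM console or CLI.
    print(json.dumps(trust_policy_for_role_b(ROLE_A_ARN), indent=2))
    print(json.dumps(assume_permission_for_role_a(ROLE_B_ARN), indent=2))
```

Both halves are needed: the trust policy on Role B names who may assume it, and the identity policy on Role A grants permission to make the call.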

Problem: I would like to create a FrameworkProcessor (specifically a TensorFlowProcessor) instance that can run on Account A but read and write data to Account B to avoid having to copy data back and forth between the two accounts.

How would this feature be used? Please describe.

A role parameter could be added to the ProcessingInput and ProcessingOutput classes, naming the role to assume before accessing the data.

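# Proposed usage: the `role` argument on ProcessingInput/ProcessingOutput
# is the requested feature and does not exist in the current SDK.
processor = TensorFlowProcessor(role=role_A, ...)
processor.run(
    inputs=[ProcessingInput(source=.., destination=.., role=role_B)],
    outputs=[ProcessingOutput(source=.., destination=.., role=role_B)],
    ...
)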

Describe alternatives you've considered

There is a role parameter in the TensorFlowProcessor constructor. However,

  • Using Role A fails with the error "No S3 objects found under S3 URL...", because the objects exist in Account B, not Account A.
  • Using Role B fails with error:

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateProcessingJob operation: RoleArn: Cross-account pass role is not allowed.

Reason: we need the SageMaker permissions on Account A to be defined in the role.
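
The ValidationException above suggests that the role passed to CreateProcessingJob must live in the same account as the job. A small pre-flight check can catch this before the API call. The sketch below uses hypothetical helper names and placeholder account IDs; it only parses the ARN string and does not call AWS.

```python
def account_id_from_role_arn(role_arn: str) -> str:
    """Return the account ID embedded in an IAM role ARN."""
    # IAM role ARNs look like arn:<partition>:iam::<account-id>:role/<name>
    parts = role_arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "iam":
        raise ValueError(f"not an IAM ARN: {role_arn!r}")
    return parts[4]


def is_cross_account_role(role_arn: str, job_account_id: str) -> bool:
    """True if the role lives in a different account than the job."""
    return account_id_from_role_arn(role_arn) != job_account_id


if __name__ == "__main__":
    # Hypothetical values: the job runs in Account A, the role is in Account B.
    role_b = "arn:aws:iam::222222222222:role/RoleB"
    if is_cross_account_role(role_b, job_account_id="111111111111"):
        print("Role is in another account; CreateProcessingJob would likely reject it.")
```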

How can I tell SageMaker to use Role A for creating and running the processing job, but to assume Role B to access the datasets in Account B? For example, I am able to do that easily in SageMaker notebooks.
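
For reference, the notebook-style approach mentioned above is typically an STS AssumeRole call followed by an S3 client built from the temporary credentials. The following is a minimal sketch under that assumption: the helper name, session name, and ARN are hypothetical, and boto3 is imported lazily so the function can also be exercised with a stub client.

```python
from typing import Any, Optional


def s3_client_kwargs_for_role(
    role_arn: str,
    session_name: str = "cross-account-data-access",
    sts_client: Optional[Any] = None,
) -> dict:
    """Assume `role_arn` via STS and return kwargs for boto3.client("s3", ...)."""
    if sts_client is None:
        import boto3  # imported lazily; only needed when no client is injected

        sts_client = boto3.client("sts")
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    creds = response["Credentials"]
    # Temporary credentials issued for the assumed role (Role B).
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


# Example usage from code running as Role A (ARN is a placeholder):
#   import boto3
#   kwargs = s3_client_kwargs_for_role("arn:aws:iam::222222222222:role/RoleB")
#   s3 = boto3.client("s3", **kwargs)
```

This only covers code you control, such as a notebook cell or the processing script itself. As far as I can tell, it does not change which role SageMaker uses when it stages ProcessingInput and ProcessingOutput data, which is the gap this issue asks about.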

Please let me know if there is a way to achieve that with the current FrameworkProcessor implementation.

ksonbol, Feb 08 '22 17:02

I ran into the same problem recently. Is there any update on this request?

Gandor26, Oct 17 '23 20:10