sagemaker-python-sdk
Allow storing inputs and outputs of FrameworkProcessor cross-accounts
Describe the feature you'd like
Definitions:
- Account A: where I will run SageMaker FrameworkProcessor jobs.
- Account B: where the data lives that I need to access for ProcessingInput.source and ProcessingOutput.destination.
- Role A: IAM role in Account A that has the SageMakerFullAccess policy attached.
- Role B: IAM role in Account B that has S3 read and write access to the needed data. Role A can assume Role B, and Role B has a trust relationship with Role A.
Problem:
I would like to create a FrameworkProcessor (specifically a TensorFlowProcessor) instance that can run on Account A but read and write data to Account B to avoid having to copy data back and forth between the two accounts.
How would this feature be used? Please describe.
A role parameter could be added to the ProcessingInput and ProcessingOutput classes. The given role would be assumed before accessing the data.
# Proposed usage: the role argument on ProcessingInput/ProcessingOutput is the requested feature
processor = TensorFlowProcessor(role=role_A, ...)
processor.run(
    inputs=[ProcessingInput(source=.., destination=.., role=role_B)],
    outputs=[ProcessingOutput(source=.., destination=.., role=role_B)],
    ...
)
Describe alternatives you've considered
There is a role parameter in the TensorFlowProcessor constructor. However,
- Using Role A fails with error: No S3 objects found under S3 URL...
Reason: the object exists in Account B, not Account A.
- Using Role B fails with error:
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateProcessingJob operation: RoleArn: Cross-account pass role is not allowed.
Reason: we need the SageMaker permissions on Account A to be defined in the role.
How can I tell SageMaker to use Role A for creating and running the processing job but to assume role B to access the datasets in account B? For example, I am able to do that easily in SageMaker notebooks.
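For context, doing this by hand in a notebook typically means calling STS AssumeRole for Role B and building an S3 client from the temporary credentials. Below is a minimal sketch, assuming boto3 is installed. The helper name assume_role_credentials, the default session name, and the ARN in the usage comment are hypothetical and not part of the SageMaker SDK.

```python
from typing import Any, Dict, Optional


def assume_role_credentials(
    role_arn: str,
    session_name: str = "cross-account-processing",  # hypothetical default
    sts_client: Optional[Any] = None,
) -> Dict[str, str]:
    """Assume role_arn via STS and return kwargs usable by boto3.client(...)."""
    if sts_client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3

        sts_client = boto3.client("sts")

    # AssumeRole returns temporary credentials under the "Credentials" key.
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


# Usage (placeholder ARN; requires that the caller may assume Role B):
#   import boto3
#   s3 = boto3.client("s3", **assume_role_credentials("arn:aws:iam::222222222222:role/RoleB"))
```

This only covers code that creates its own S3 client. It does not change which role SageMaker itself uses to download ProcessingInput data or upload ProcessingOutput data, which is the gap this request is about.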
Please let me know if there is a way to achieve that with the current FrameworkProcessor implementation.
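Another possible route, sketched here under assumptions and not confirmed to work with FrameworkProcessor, is a bucket policy in Account B that grants Role A direct access to the data, so that no role assumption is needed. The function name cross_account_bucket_policy, the bucket name, and the ARN in the usage comment are hypothetical.

```python
import json
from typing import Any, Dict


def cross_account_bucket_policy(bucket: str, role_arn: str, prefix: str = "") -> Dict[str, Any]:
    """Build an S3 bucket policy letting role_arn list, read, and write objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it targets the bucket ARN.
                "Sid": "AllowRoleAListBucket",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object reads and writes target the object ARNs under the prefix.
                "Sid": "AllowRoleAReadWriteObjects",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


# Usage (placeholder names): print the JSON to attach to the bucket in Account B.
#   print(json.dumps(cross_account_bucket_policy(
#       "account-b-data-bucket", "arn:aws:iam::111111111111:role/RoleA"), indent=2))
```

For cross-account access, Role A's own IAM policy would also need to allow the same S3 actions on this bucket. A bucket policy alone is not sufficient.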
I ran into the same problem recently. Is there any update on this request?