
Executing sagemaker.get_execution_role() locally

Open opringle opened this issue 5 years ago • 18 comments


System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): MXNet/None
  • Framework Version: 1.1.0
  • Python Version: 3.5
  • CPU or GPU: CPU
  • Python SDK Version: 1.7.0
  • Are you using a custom image: No

Describe the problem

  • I want to run SageMaker without a notebook instance, from a script on my local machine, for various reasons.
  • I can successfully start SageMaker jobs by passing the ARN string of my AWS role to my script.
  • However, I cannot retrieve the ARN string programmatically using sagemaker.get_execution_role(). Instead, I receive a botocore.errorfactory.NoSuchEntityException.

Minimal repro / logs

To reproduce the problem:

Script:

import sagemaker
import boto3

session = boto3.Session(profile_name='personal')
sagemaker_session = sagemaker.Session(boto_session=session)
role = sagemaker.get_execution_role(sagemaker_session=sagemaker_session)

Credentials:

[personal]
aws_secret_access_key = ******************
aws_access_key_id = *******************
region = us-west-2

Error:

Traceback (most recent call last):
  File "mwe.py", line 8, in <module>
    role = sagemaker.get_execution_role(sagemaker_session=sagemaker_session)
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/sagemaker/session.py", line 936, in get_execution_role
    arn = sagemaker_session.get_caller_identity_arn()
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/sagemaker/session.py", line 766, in get_caller_identity_arn
    role = self.boto_session.client('iam').get_role(RoleName=role_name)['Role']['Arn']
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/opringle/.virtualenvs/vdcnn/lib/python3.6/site-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchEntityException: An error occurred (NoSuchEntity) when calling the GetRole operation: The user with name oliver_pringle cannot be found.
  • Exact command to reproduce: pip install sagemaker && python mwe.py

opringle avatar Jul 17 '18 21:07 opringle

Hi @opringle ,

The problem is that get_execution_role() is only meant to be used on AWS SageMaker notebook instances. If you call it locally, it won't parse your credentials correctly (from your stack trace, I think you are using IAM user credentials).

So if you want to use SageMaker locally, you can create an IAM role with sufficient SageMaker permissions, then pass that role's ARN directly in your code.

Feel free to reopen this if you have more questions.

Thanks

yangaws avatar Jul 30 '18 23:07 yangaws

This is really a pretty bad experience. get_execution_role() sounds like it's going to just figure out all the IAM/role/confusion/whatever to make SageMaker work. And on a notebook instance it does. But if you run that same code on your laptop it fails, sending customers into IAM/role/confusion limbo.

leopd avatar Nov 15 '18 23:11 leopd

Without this it's basically impossible to write a simple set of code that works both on a SageMaker notebook instance and anywhere else. Which is a real barrier to people who want to build the SageMaker ecosystem.

leopd avatar Nov 15 '18 23:11 leopd

understood. definitely agree that the SDK can do better here. I'll leave this issue open as a feature request, and hopefully we can prioritize this work in the near future. Thanks @leopd!

laurenyu avatar Nov 16 '18 00:11 laurenyu

Also having issues here, +1 to smoothing it out.

thomelane avatar Nov 21 '18 23:11 thomelane

same

Soypete avatar Dec 19 '18 23:12 Soypete

A temporary solution is to reuse the IAM role attached to your notebook instance (you chose one when you created the notebook). You can get its ARN from the IAM console.

iluoyi avatar Dec 21 '18 07:12 iluoyi

I think local mode should work offline, what need is there to check credentials when running locally?

stevehawley avatar Mar 18 '19 18:03 stevehawley

I have written this super hacky function to resolve the SageMaker execution role. It may fail miserably, and you should probably not use it at all. But it may work in simple cases:

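import boto3

def resolve_sm_role():
    # IAM is a global service, so the client does not need a region.
    client = boto3.client('iam')
    # Paginate so that roles beyond the first page of results are also checked.
    paginator = client.get_paginator('list_roles')
    for page in paginator.paginate(PathPrefix='/'):
        for role in page['Roles']:
            # Returns the first role whose name has the default console-generated prefix.
            if role['RoleName'].startswith('AmazonSageMaker-ExecutionRole-'):
                print('Resolved SageMaker IAM Role to: ' + str(role))
                return role['Arn']
    raise Exception('Could not resolve what should be the SageMaker role to be used')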

gilinachum avatar Dec 14 '19 21:12 gilinachum

sagemaker.get_execution_role() could basically get the environment variable AWS_ROLE_SESSION_NAME as it's documented for credentials setup, and that would fit local processing too. But, sorry, all AWS IAM needs a refactoring

ricoms avatar Dec 20 '19 16:12 ricoms

Putting iluoyi's solution in code

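import boto3
import sagemaker

try:
    role = sagemaker.get_execution_role()
except ValueError:
    # Not on a notebook instance: look up the role by name instead.
    # Replace the RoleName below with the name of your own execution role.
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='AmazonSageMaker-ExecutionRole-20191205T100050')['Role']['Arn']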

A SageMaker execution role already exists if you have ever run a job before. If not:

  1. Log onto the console -> IAM -> Roles -> Create Role
  2. Create a service role with sagemaker.amazonaws.com as the trusted service
  3. Give the role AmazonSageMakerFullAccess
  4. Give the role AmazonS3FullAccess (<-- scope down if reasonable)

Then use the name in RoleName= like above
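
For anyone who prefers to script the console steps above, here is a minimal sketch using the boto3 IAM calls create_role and attach_role_policy. The function name create_sagemaker_role, its parameters, and the role name in the usage note are made up for illustration; the IAM client is passed in so the function can be tried against a stub. It attaches only AmazonSageMakerFullAccess by default; whether you also need an S3 policy depends on your setup.

```python
import json

# Trust policy that lets the SageMaker service assume the role.
SAGEMAKER_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS managed policy named in step 3 above.
DEFAULT_POLICY_ARNS = ("arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",)


def create_sagemaker_role(iam_client, role_name, policy_arns=DEFAULT_POLICY_ARNS):
    """Create a role SageMaker can assume, attach policies, return the role ARN.

    Hypothetical helper: iam_client is expected to behave like boto3.client('iam').
    """
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(SAGEMAKER_TRUST_POLICY),
        Description="SageMaker execution role for jobs started outside a notebook",
    )
    for policy_arn in policy_arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return response["Role"]["Arn"]
```

Usage would look like create_sagemaker_role(boto3.client('iam'), 'MySageMakerExecutionRole'), where the role name is a placeholder. The call fails if a role with that name already exists, so a real script would want to check for that first.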

A potential long-term solution would be a function that checks for an existing execution service role and creates a new one if it does not exist... but service-role creation with managed policies through boto3 IAM requires... patience.

NukaCody avatar Jan 24 '20 01:01 NukaCody

Any plans to fix this? This is very annoying if you want to execute notebooks locally. get_execution_role should create a default role with SM permissions when called out of a notebook.

larroy avatar Sep 09 '20 00:09 larroy

Nothing yet?

rodrigoheck avatar Jan 20 '21 04:01 rodrigoheck

Almost three years later and this is still an issue?

rapuckett avatar Apr 26 '21 16:04 rapuckett

Got today "The current AWS identity is not a role: arn:aws:iam::XXXXXXXXXX:user/xxxxxxxx, therefore it cannot be used as a SageMaker execution role."
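
To see which case you are in before calling the SDK, a small check on the identity ARN can help. This is only a sketch: identity_kind is a made-up helper, and it simply inspects the resource part of the ARN string rather than reproducing what the SDK does internally.

```python
def identity_kind(arn):
    """Return 'user', 'role', 'assumed-role', or 'other' for an AWS identity ARN."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError("not an ARN: " + arn)
    # The resource looks like 'user/name', 'role/path/name' or
    # 'assumed-role/role-name/session-name'; keep only the leading type.
    resource_type = parts[5].split("/", 1)[0]
    if resource_type in ("user", "role", "assumed-role"):
        return resource_type
    return "other"
```

The ARN of the current identity can be read with boto3.client('sts').get_caller_identity()['Arn']. If the helper returns 'user', as in the message quoted above, you need to supply a role ARN yourself.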

TanjaNY avatar Apr 29 '21 16:04 TanjaNY

The above solution (https://github.com/aws/sagemaker-python-sdk/issues/300#issuecomment-577957428) is in docs now: https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html

cccntu avatar May 09 '21 10:05 cccntu

No update there? This issue is 4 years old ...

tchaton avatar Jun 30 '22 11:06 tchaton

Just stumbled across this issue. Will this issue ever be solved?

ghost avatar Aug 01 '22 14:08 ghost

Inside SageMaker we can have multiple notebook instances and each notebook instance can have a different IAM role. When running your code locally get_execution_role will not work since there might be several roles dedicated to different SageMaker notebook instances. Therefore, you have to choose which is the right role to use.

To make your code work both locally and on a notebook instance, you could store the ARN of the IAM role you want in a variable and fall back to it in a try block like the one below.

local_variable_for_sm_role = "arn:aws:iam::XXXX:role/service-role/XXXXX"
try:
    role = sagemaker.get_execution_role()
except ValueError:
    role = local_variable_for_sm_role
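
A variation on the same idea, as a rough sketch, reads the fallback ARN from an environment variable so it is not hard-coded in the script. The helper resolve_role and the variable name SAGEMAKER_ROLE_ARN are assumptions for illustration, not something the SDK defines. It only catches ValueError; the traceback at the top of this thread shows an older SDK version raising a botocore exception instead, which this sketch does not handle.

```python
import os


def resolve_role(lookup, env_var="SAGEMAKER_ROLE_ARN"):
    """Return lookup() if it succeeds, else the role ARN stored in env_var.

    lookup is any zero-argument callable, e.g. sagemaker.get_execution_role.
    env_var is a hypothetical variable name chosen for this example.
    """
    try:
        return lookup()
    except ValueError:
        role = os.environ.get(env_var)
        if not role:
            # No fallback configured: surface the original error unchanged.
            raise
        return role
```

With the SDK installed this would be called as role = resolve_role(sagemaker.get_execution_role), after exporting SAGEMAKER_ROLE_ARN on the local machine.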

ioanfr avatar Feb 15 '23 15:02 ioanfr

It seems that sagemaker-python-sdk team does not care about the community issues.

celsofranssa avatar Oct 13 '23 14:10 celsofranssa

I got the same error. Tried everything, is it still an issue?

variable-ad avatar Apr 03 '24 19:04 variable-ad

I got the same error. Tried everything, is it still an issue?

I am working around it like this: I created a SageMaker all-access role and set role to that role's ARN, which works for me. role = 'arn:aws:iam::ACCTNMRXXXX:role/SageMakerAllAccess'

TanjaNY avatar Apr 04 '24 06:04 TanjaNY