amazon-emr-vscode-toolkit

Error fetching AWS credentials when using remote ssh + dev container vscode extensions

Open jlafaye opened this issue 3 years ago • 6 comments

Hello,

I'm trying to use the toolkit with the following setup:

  • EMR toolkit v0.6.0
  • VSCode UI running on a Windows desktop computer (no Docker installed, per company policy)
  • Dev environment running on an EC2 instance
  • IAM role attached with the permissions required for running the job & accessing EMR clusters/applications

What works

  • Browsing EMR Clusters/Containers/Serverless & Glue Catalog when running in the SSH environment
  • Running aws sts get-caller-identity in the ssh+devcontainer environment. It returns the ARN of the instance role, obtained through instance metadata (http://169.254.169.254:80 "GET /latest/meta-data/iam/security-credentials/emr_devex-docker-host-role HTTP/1.1" 200 1590)
  • Running the demo pyspark job in the ssh+devcontainer environment

What does not work

  • Browsing EMR Clusters/Containers/Serverless & Glue Catalog from the ssh+devcontainer environment. Those fail with the error below:

Error fetching EMR Serverless applications! CredentialsProviderError: Could not load credentials from any providers

It would be great if you could provide guidance on how to troubleshoot this. I would be happy to provide more details if needed. The toolkit is a great addition to VSCode and I'm sure it can make developers' lives easier.

jlafaye avatar Apr 10 '23 07:04 jlafaye

Hi @jlafaye - Thanks for opening the issue and apologies that things aren't working out for you. Just to clarify - you're SSH'ed into an EC2 instance and then also have a devcontainer/Docker environment running on that instance?

How does your devcontainer authenticate to AWS? In other words, if the IAM role is attached to the EC2 instance itself, how does the devcontainer make use of that role?

I haven't tried running this in a remote environment so bear with me. A couple things to try:

  1. If you choose EMR: Select AWS Profile from the command palette, are you provided with a list of profiles? And if so, are they from your local computer or the dev environment?
  2. Make sure you've selected the proper region as well with the EMR: Select AWS Region command.

If neither of those provides insight, I'll both try to set up a remote environment and add better error logging. At the moment, if you click on the OUTPUT tab in VS Code, there is an "Amazon EMR" section, but the current logs are just status logs.

dacort avatar Apr 18 '23 22:04 dacort

Hi @dacort - Thank you for taking the time to read my message.

My devcontainer authenticates to AWS through instance metadata inherited from the instance the container is running on. I have set AWS_EC2_METADATA_DISABLED to false in devcontainer.json.

  1. EMR: Select AWS Profile does not list any profiles.
  2. Changing to the correct region (eu-west-1 in my case) does not change anything.

Sorry for not being able to provide more info.

jlafaye avatar Apr 21 '23 17:04 jlafaye

Just leaving some debugging notes:

  • Confirmed that on an EC2 instance with docker installed, aws s3 ls works
  • Confirmed that with docker run --rm -it amazonlinux:2, aws s3 ls works
  • Confirmed that with docker run --rm -it --entrypoint /bin/bash 895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.10.0, aws s3 ls works

Next, need to try setting up ssh + devcontainer.

  • Confirmed that EMR extension can list Glue tables with proper policy attached to EC2 instance
  • Confirmed that a container created from the Create local Spark environment command, with None for the credential option, is able to run aws s3 ls
  • Confirmed that the vscode devcontainer cannot run aws s3 ls with "AWS_EC2_METADATA_DISABLED": "true",
  • Confirmed that the vscode devcontainer can run aws s3 ls with "AWS_EC2_METADATA_DISABLED": "false",
  • ~Confirmed that EMR extension can list Glue tables when AWS_EC2_METADATA_DISABLED is changed to false.~
  • EMR extension cannot list Glue tables even when AWS_EC2_METADATA_DISABLED is changed to false.

For some reason, it looks like AwsCredentialIdentityProvider in aws_context.ts isn't finding the instance credentials.

The code I used to debug:
// Run as an ES module, since this uses top-level await.
import { GlueClient, GetDatabasesCommand } from "@aws-sdk/client-glue";
import { fromInstanceMetadata } from "@aws-sdk/credential-providers";

// Step 1: ask IMDS for credentials directly, failing fast if it is unreachable.
console.log("----------------------------------");
const credentials = await fromInstanceMetadata({ timeout: 1000, maxRetries: 0 })();
console.log(credentials);

// Step 2: let the Glue client resolve credentials through its default provider chain.
console.log("----------------------------------");
const glue = new GlueClient({ region: "us-west-2" });
const result = await glue.send(new GetDatabasesCommand({}));
console.log(result.DatabaseList ?? []);

dacort avatar May 01 '23 22:05 dacort

Think I figured this out! 😮‍💨

Can you try removing AWS_EC2_METADATA_DISABLED entirely from the containerEnv section of your devcontainer.json file?

When the SDK tries to retrieve credentials from IMDS, it checks for that environment variable, but uses the following code:

if (process.env[ENV_IMDS_DISABLED])

Unfortunately, environment variables come in as strings, and any non-empty string is truthy in JavaScript, so even if we set it to false or 0, the check evaluates to true.
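
To make the pitfall concrete, here is a minimal sketch of the two behaviours. It is not the SDK's actual source: the helper names isDisabledByTruthiness and isDisabledByParsing are hypothetical, the env argument stands in for process.env, and the mapping of ENV_IMDS_DISABLED to AWS_EC2_METADATA_DISABLED is assumed from the discussion above.

```typescript
// Assumption: ENV_IMDS_DISABLED names the variable discussed in this thread.
const ENV_IMDS_DISABLED = "AWS_EC2_METADATA_DISABLED";

// Stand-in for process.env: every value is a string or undefined.
type Env = Record<string, string | undefined>;

// Hypothetical helper mirroring the truthiness check quoted above:
// any non-empty string, including "false" and "0", counts as disabled.
function isDisabledByTruthiness(env: Env): boolean {
  return Boolean(env[ENV_IMDS_DISABLED]);
}

// Hypothetical stricter alternative: only the literal string "true"
// (case-insensitive, surrounding whitespace ignored) counts as disabled.
function isDisabledByParsing(env: Env): boolean {
  const raw = env[ENV_IMDS_DISABLED];
  return raw !== undefined && raw.trim().toLowerCase() === "true";
}

// Compare both checks across the values tried in this thread.
const candidates: Array<string | undefined> = ["true", "false", "0", undefined];
for (const value of candidates) {
  const env: Env = { [ENV_IMDS_DISABLED]: value };
  console.log(
    `${String(value)}: truthiness=${isDisabledByTruthiness(env)}, parsed=${isDisabledByParsing(env)}`
  );
}
```

With the truthiness check, only leaving the variable unset (or empty) keeps IMDS enabled, which is why removing it from containerEnv is the suggested workaround.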

Leaving this issue open as I'd like to add IMDS as an auth option when creating the container.

dacort avatar May 02 '23 22:05 dacort