terraform-provider-aws
Unable to invoke Lambda with environment variables due to KMS AccessDeniedException
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform Version
- Terraform v0.11.10
- provider.aws v1.42.0
Affected Resource(s)
- aws_lambda_function
Terraform Configuration Files
resource "aws_iam_role" "myrole" {
  name = "terraform-kms-test"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}
resource "aws_iam_role_policy_attachment" "basic_exec" {
  role       = "${aws_iam_role.myrole.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}
resource "aws_lambda_function" "myfunction" {
  filename         = "build.zip"
  function_name    = "terraform-kms-test"
  role             = "${aws_iam_role.myrole.arn}"
  handler          = "index.handler"
  source_code_hash = "${base64sha256(file("build.zip"))}"
  runtime          = "nodejs8.10"
  environment {
    variables {
      MY_CONFIG = "config value"
    }
  }
}
Expected Behavior
A function that uses environment variables should remain invocable after its role name is changed.
Actual Behavior
- On initial deployment, the function is able to be invoked without any errors
- But if you change the IAM role name and rerun terraform apply, invoking the function returns the following error:
Calling the invoke API action failed with this message: Lambda was unable to decrypt the environment variables because KMS access was denied. Please check the function's KMS key settings. KMS Exception: AccessDeniedExceptionKMS Message: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access.
Steps to Reproduce
- terraform apply
- Change aws_iam_role name to something different
- terraform apply
References
- Seems to be a related issue #4633
I'm also experiencing this issue. Looking in CloudTrail, I do see KMS CreateGrant calls being made. It's not clear why these calls are made, since in this case no KMS key is specified for encrypting the environment variables.
Part of the issue is that Lambda appears to be accessing a KMS key that no longer exists, as that key is also removed by Terraform.
Here is the example:
{
  "requestParameters": {
    "operations": [
      "Decrypt",
      "RetireGrant"
    ],
    "granteePrincipal": "arn:aws:sts::ACCOUNT:assumed-role/LAMBDA-IAM-ROLE/NAME-OF-THE-LAMBDA-FUNCTION",
    "keyId": "arn:aws:kms:REGION:ACCOUNT:key/SOME-KEY-ID",
    "constraints": {
      "encryptionContextEquals": {
        "aws:lambda:FunctionArn": "arn:aws:lambda:REGION:ACCOUNT:function:NAME-OF-THE-LAMBDA-FUNCTION"
      }
    },
    "retiringPrincipal": "arn:aws:sts::ACCOUNT:assumed-role/LAMBDA-IAM-ROLE/NAME-OF-THE-LAMBDA-FUNCTION"
  },
  "eventType": "AwsApiCall",
  "responseElements": {
    "grantId": "SOME-GRANT-ID"
  },
  "awsRegion": "REGION",
  "eventName": "CreateGrant",
  "readOnly": false,
  "eventSource": "kms",
  "userAgent": "lambda.amazonaws.com",
  "sourceIPAddress": "lambda.amazonaws.com",
  "resources": [
    {
      "type": "AWS::KMS::Key",
      "ARN": "arn:aws:kms:REGION:ACCOUNT:key/SOME-KEY-ID",
      "accountId": "ACCOUNT"
    }
  ],
  "recipientAccountId": "ACCOUNT"
}
Any news about this issue? If no one has looked at it, I could probably try my luck at fixing it.
I got this same problem on my production app. According to this similar issue, a redeploy should be enough. I'm going to try this.
I just hit this issue also. Had to do a terraform destroy followed by a terraform apply to resolve.
Hi folks 👋 Sorry you are running into this strange behavior.
The maintainers here are not sure what the right action should be here given the vastly different experiences folks are having. Is documenting the potential for this odd behavior in the role argument for the aws_lambda_function resource documentation enough? We also could automatically trigger a code publish if role is updated. The caveat there is that we could only publish the function again if the practitioner enabled the publish argument.
Suggestions welcome, thanks!
Thank you for taking the time to consider this.
I think updating the documentation to explicitly mention this issue might be good. Would it also be possible to recommend including the IAM role name in the source_code_hash? I'm not sure if a function update is enough to fix the issue though, but something like this. Is that what you mean by publish?
source_code_hash = "${base64sha256(file("build.zip"))}-${aws_iam_role.myrole.name}"
Seems to be the same issue:
https://github.com/serverless/examples/issues/279
I also just ran into it. I'm using TF 0.13.1. Destroying and applying doesn't solve it.
That worked: https://github.com/terraform-providers/terraform-provider-aws/issues/6352#issuecomment-554359665
This is the code that causes the issue when testing the lambda. The problem started after introducing the Logging-policies and associating them with the Lambda's role.
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}
provider "aws" {
  profile = "default"
  region  = "us-west-2"
}
resource "aws_s3_bucket" "project_project_bucket" {
  bucket = "project-project-bucket-g3dg6gf4fddk"
  acl    = "private"
}
resource "aws_iam_role" "lambda_project_etl" {
  name = "iam_role_project_etl"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}
# See also the following AWS managed policy: AWSLambdaBasicExecutionRole
resource "aws_iam_policy" "lambda_project_etl" {
  name        = "lambda_logging"
  path        = "/"
  description = "IAM policy for logging from a lambda"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*",
      "Effect": "Allow"
    }
  ]
}
EOF
}
resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.lambda_project_etl.name
  policy_arn = aws_iam_policy.lambda_project_etl.arn
}
resource "aws_lambda_layer_version" "lambda_layer_project_etl" {
  filename            = "dist/lambda_layer_project_etl.zip"
  layer_name          = "lambda_layer_project_etl"
  compatible_runtimes = ["python3.8"]
}
resource "aws_lambda_function" "lambda_project_etl" {
  filename         = "dist/lambda_project_etl.zip"
  function_name    = "lambda_import_main_stories_by_day"
  role             = aws_iam_role.lambda_project_etl.arn
  handler          = "lambda_import_main_stories_by_day.main"
  source_code_hash = filebase64sha256("dist/lambda_project_etl.zip")
  runtime          = "python3.8"
  layers           = [aws_lambda_layer_version.lambda_layer_project_etl.arn]
  environment {
    variables = {
      foo = "bar"
    }
  }
}
Seeing something similar when trying to execute an AWS Lambda function:
"Calling the invoke API action failed with this message: Lambda was unable to decrypt the environment variables because KMS access was denied. Please check the function's KMS key settings. KMS Exception: UnrecognizedClientExceptionKMS Message: The security token included in the request is invalid."
Would this not be as simple as adding the role for the lambda to the depends_on attribute in the Lambda, so you make sure that the Role is created before the lambda?
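For reference, the depends_on idea could be sketched as below, reusing the resource names from the configuration posted earlier in this thread. Note that role = aws_iam_role.lambda_project_etl.arn already gives Terraform an implicit dependency on the role itself, so the only ordering an explicit depends_on adds is on the policy attachment. Nothing in this thread confirms that ordering alone resolves the KMS error, so treat this as an illustration of the suggestion rather than a fix.

```hcl
# Sketch only: same function as in the earlier configuration, with an
# explicit dependency on the policy attachment added.
resource "aws_lambda_function" "lambda_project_etl" {
  filename         = "dist/lambda_project_etl.zip"
  function_name    = "lambda_import_main_stories_by_day"
  role             = aws_iam_role.lambda_project_etl.arn # implicit dependency on the role
  handler          = "lambda_import_main_stories_by_day.main"
  source_code_hash = filebase64sha256("dist/lambda_project_etl.zip")
  runtime          = "python3.8"
  layers           = [aws_lambda_layer_version.lambda_layer_project_etl.arn]

  environment {
    variables = {
      foo = "bar"
    }
  }

  # Explicit ordering: create the function only after the logging policy
  # is attached to the role.
  depends_on = [aws_iam_role_policy_attachment.lambda_logs]
}
```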
Had the same issue here. Performing taint on the problem lambda resource and replacing/recreating it via apply seems to have solved the issue.
Difficult to find that this was a problem initially...
Initially I updated an unrelated resource, where the lambda related policies needed to be replaced/recreated.
When retesting after the initial update, I got the following cloudfront error:
500 {"message":null}
'X-Cache': 'Error from cloudfront'
The lambda function was showing errors, but I didn't get any log output that would help debug the issue.
Hi, there is a second workaround: change the lambda role to a different one and then change it back to the original lambda role (I guess AWS updates something behind the scenes). More info: https://github.com/serverless/examples/issues/279#issuecomment-420387109
It has been over a year since anyone commented on this issue. I will be working to repro this and closing it if AWS and/or AWS provider changes have fixed the problem in the interim. Please let us know if you continue to face problems with this!
This is a great explanation from Paul Allen on the problem:
When you provide environment variables to a Lambda function, they're encrypted using a KMS key. Either a customer-managed key that you provide or an AWS managed default key (with the alias aws/lambda). When environment variables are first defined, if the default key is used then Lambda creates a grant on that key letting the execution role use it for decrypting the environment variables.
But, if that role is deleted and then re-created, the grant is no longer valid! This is the same as other resource-based policies when the principal is removed but this is special because we never actually explicitly created that grant ourselves. This means the function will start failing for no apparent reason.
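To check whether the default key described above is the one involved, a small lookup like the following may help. It is a minimal sketch that assumes the AWS managed key with alias aws/lambda already exists in the account and region (it is only created once Lambda has used it); the names lambda_default and lambda_default_key_arn are made up for illustration.

```hcl
# Look up the AWS managed default key that Lambda uses for environment
# variables when no customer-managed key is configured.
data "aws_kms_alias" "lambda_default" {
  name = "alias/aws/lambda"
}

# Expose the key ARN so it can be compared with the keyId recorded in the
# CloudTrail CreateGrant event for the function.
output "lambda_default_key_arn" {
  value = data.aws_kms_alias.lambda_default.target_key_arn
}
```

If the ARN printed here matches the keyId in the CreateGrant event, the grant in question was made on the default key rather than on a key managed in your configuration.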
I was able to reproduce this problem with this configuration:
data "aws_partition" "current" {}
resource "aws_iam_role" "test" {
  name                = "roletna"
  managed_policy_arns = ["arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"]
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole",
      Principal = {
        Service = "lambda.${data.aws_partition.current.dns_suffix}",
      }
      Effect = "Allow"
    }]
  })
}
data "archive_file" "lambdazip" {
  type                    = "zip"
  output_path             = "lambda.zip"
  source_content          = "def handler(event, context):\n\tpass\n"
  source_content_filename = "lambda.zip"
}
resource "aws_lambda_function" "test" {
  function_name = "dicvojid"
  role          = aws_iam_role.test.arn
  handler       = "index.handler"
  runtime       = "python3.9"
  filename      = data.archive_file.lambdazip.output_path
  environment {
    variables = {
      foo = "bar"
    }
  }
}
Check function, then delete, and recreate role:
% aws lambda invoke \
--function-name dicvojid \
outfile
{
"StatusCode": 200,
"FunctionError": "Unhandled",
"ExecutedVersion": "$LATEST"
}
% aws iam detach-role-policy \
--role-name roletna \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
% aws iam delete-role --role-name roletna
% terraform apply
% aws lambda invoke \
--function-name dicvojid \
outfile
An error occurred (KMSAccessDeniedException) when calling the Invoke operation (reached max retries: 2): Lambda was unable to decrypt the environment variables because KMS access was denied. Please check the function's KMS key settings. KMS Exception: AccessDeniedExceptionKMS Message: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access.
We will not fix this issue except with documentation updates. We won't fix this with provider code changes for these reasons:
- The resources are acting as expected, managing what they are supposed to manage.
- AWS Lambda manages a grant on the KMS key to the function's IAM role that Terraform does not directly manage.
- The invocation error arises in the aws_lambda_invocation resource or data source, but seamlessly fixing the problem would require performing management on a Lambda function, which should take place in the aws_lambda_function resource.
- This should not typically be an ongoing issue practitioners run into as a normal part of operations, but something that occurs when the IAM role is inadvertently or mistakenly recreated.
- There are painless fixes to the problem: reassigning the function's role to another role and back to the recreated role, or tainting and recreating the function.
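The "recreate the function" fix can also be expressed declaratively. The sketch below is not something the maintainers proposed here: it assumes Terraform 1.2 or later, where the replace_triggered_by lifecycle argument is available, and reuses the resource names from the reproduction configuration above. It relies on the role's unique_id attribute changing whenever the role is recreated. I would expect it to cover the case where Terraform itself replaces the role (for example after a rename); whether it also catches a role that was deleted outside Terraform is not confirmed anywhere in this thread.

```hcl
# Fragment: aws_iam_role.test and data.archive_file.lambdazip are the
# resources from the reproduction configuration.
resource "aws_lambda_function" "test" {
  function_name = "dicvojid"
  role          = aws_iam_role.test.arn
  handler       = "index.handler"
  runtime       = "python3.9"
  filename      = data.archive_file.lambdazip.output_path

  environment {
    variables = {
      foo = "bar"
    }
  }

  lifecycle {
    # Assumption: replace the function whenever the role's unique ID
    # changes, so Lambda sets up a fresh KMS grant for the new role.
    replace_triggered_by = [aws_iam_role.test.unique_id]
  }
}
```

The trade-off is that the function is destroyed and recreated on every role replacement, which is the same effect as tainting it by hand.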
Thank you for your time and input on this! We apologize for the delay in clearing this up. Look for documentation additions.
This functionality has been released in v4.58.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.
For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.