aws-cdk
aws-cdk copied to clipboard
feat(lambda): allow to delete log group of a lambda function when it is removed
closes #21804
Now we can set autoDeleteLogGroup: true
to delete an associated Lambda log automatically.
new lambda.Function(stack, 'AutoDeleteLogGroup', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
autoDeleteLogGroup: true,
});
Also I fixed an issue of logRetention when there are multiple LogRetention
resources. To be short the backing Lambda function is shared with all the LogRetention (similar to SingletonLambda) so we need to add an IAM policy every time when a LogRetention with RemovalPolicy=destory
is created.
Because without this fix the integration tests added in this PR won't work I'm packing those two fixes in this PR. Please let me know if I should submit two separate PRs :)
Consideration on property removal
If a user remove autoDeleteLogGroup
property as below, the logRetention resource will be removed from a synthesized template, and it will result in the removal of log group itself. This implicit behavior should be clearly warned in the doc.
new lambda.Function(stack, 'AutoDeleteLogGroup', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
- autoDeleteLogGroup: true,
});
There was an issue regarding s3 Bucket's autoDeleteObjects
feature: #16603. We might need to implement a similar approach. It might be difficult since we are not directly managing LogGroup in CFn though.
All Submissions:
- [X] Have you followed the guidelines in our Contributing guide?
Adding new Unconventional Dependencies:
- [ ] This PR adds new unconventional dependencies following the process described here
New Features
- [X] Have you added the new feature to an integration test?
- [X] Did you use
yarn integ
to deploy the infrastructure and generate the snapshot (i.e.yarn integ
without--dry-run
)?
- [X] Did you use
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license
Hi @TheRealAmazonKendra I really appreciate your review and effort you expended on my PR! So let me summarize the discussion so far.
I assume there are currently three possible options to be considered:
// #1
// Pros: Less typing. Directly expressing developer's intent.
// Cons: Due to more abstraction, the API becomes different from LogRetention construct.
new lambda.Function(stack, 'AutoDeleteLogGroup', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
autoDeleteLogGroup: true,
});
// #2
// Pros: Similar API to LogRetention construct.
// Cons: * Requires more typing. We have to import aws-cdk-lib.RemovalPolicy, and the property name seems a little redundant.
// * We can also pass RemovalPolicy.SNAPSHOT although it means nothing here, which is unnecessarily confusing.
// * It may look like removal policy of Lambda function itself, which is also misleading.
new lambda.Function(stack, 'AutoDeleteLogGroup', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
logRetentionRemovalPolicy: RemovalPolicy.DESTROY,
});
// #3
// If we go with this approach we need much refactor work. I guess we can ignore this approach for now.
new lambda.Function(stack, 'AutoDeleteLogGroup', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
logRetentionProps: {
// also move other logRetention* props here
removalPolicy: RemovalPolicy.DESTROY,
},
});
I agree with API consistency is important, and I somewhat agree with that we need to be consistent with props of internal constructs. But LogRetentionProps
and FunctionPros
are in different abstraction layers. Developers do not necessarily have to know about implementation detail of LogRetention
when they want to automatically delete log groups. To me using RemovalPolicy here seems kind of leaky abstraction where we can achieve simpler API.
What we also need to be care about is the consistency between constructs that are using LogRetention
. I did some study about this and found there are now five constructs using LogRetention
: AppSync, ChatBot, DocumentDB, RDS, and Lambda. None of them currently have implemented log removal feature yet, so I believe we don’t have to care about consistency in terms of this point, and we can design the API from scratch. Also as you can see, they do not currently have consistent API with each other at all anyway:(
To be honest after the release of this feature I’ll be specifying this option every time I create a Lambda function. And that's why I’m inclined to keep this API as simple as possible... I'd like to avoid to specify cdk.RemovalPolicy repetitively where a single boolean is enough. As a side note, I asked my colleagues about this today and most of them prefer #1 over #2 due to its simplicity.
That being said I'm willing to follow your final decision. Considering all of the above, how do you think about #1 and #2? And sorry for writing such a long message with my poor English. Let me know if there are any sentences that does not make sense! Thanks.
@TheRealAmazonKendra According to the design guidelines, flat props structure is prefferred over nested structure. (doc). I guess it partially explains why most of the log properties are currently designed as such instead of #3 approach.
If we are to fully replace the existing contract design, and if the contract consistency is a top priority, how about adding a new interface to set log retention?
e.g.
interface ILoggable {
setLogRetention(props: LogRetentionProps): LogRetention;
logGroup: ILogGroup;
}
class Function implements ILoggable {
setLogRetention(props: LogRetentionProps) {
// ...
}
}
const func = new lambda.Function(stack, 'AutoDeleteLogGroup', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
});
func.setLogRetention({ removalPolicy: RemovalPolicy.DESTROY })
As long as they implement this interface, the API will be ensured to be consistent between constructs.
However I'm not sure if the above approach will work well with some constructs such as AppSync.GraphqlApi, which seemingly want to provide optimized experience by defining their own props (e.g. LogConfig). I guess we also need to discuss with people who are developing the constructs?
Also I think we should, ideally, define a dedicated props interface for setLogRetention
instead of re-using the existing LogRetentionProps
, because setting logGroupName
here does not make sense at all even if logGroupName
is optional. The log group name should always be computed from the resource properties (e.g. function name.) So we need something like Omit<LogRetentionProps, 'logGroupName'>
instead (of course we cannot use generics with jsii so it should be manually defined.)
Hi @TheRealAmazonKendra As I noted on the PR description, I found there will be an issue similar to #16603 in LogRetention construct (i.e. users may remove a log group unintentionally.) Is it something we must resolve before adding this PR's feature? If so I'll work on it before this PR.
This PR has been in the CHANGES REQUESTED state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.
@tmokmss Please could you advise on the current state of this PR (e.g. active, blocked or abandoned)? It looks very useful to what my team is currently trying to achieve if it can be progressed/merged.
Hi @stec00 From my perspective there are still two remaining problems for this issue:
- what API will look like? (ref)
- a log group can be accidentally deleted, as described in the PR body; do we need to handle this issue?
I'm waiting comments from @TheRealAmazonKendra, and your comments on API design is also welcome.
Thanks @tmokmss . Not sure I'm qualified to comment on all the detail here as it looks like it requires knowledge of patterns elsewhere that I don't currently have. But am very keen for it to be progressed as we currently have a lot of old log groups remaining from Lambdas whose CloudFormation stacks have long since been deleted. (We can of course clean these up manually but the problem will keep resurfacing). @TheRealAmazonKendra are you able to help steer this please - or if not is there someone else who could re-review?
This seems to be stagnating now. Would be a shame for the work done so far to go to waste, as it's providing useful missing functionality - in my case I have a stack that's in production and need to change the template so that future Lambda log groups get auto-deleted when their stacks are deleted. CC: @TheRealAmazonKendra @tmokmss
@stec00 A workaround would be something like this (not ideal of course):
// In bin/xxxx.ts
const app = new cdk.App();
new YourStack(app, "YourStack");
class LogRemover implements IAspect {
public visit(construct: IConstruct): void {
if (construct.node.id == 'LogRetentionaae0aa3c5b4d4f87b02d85b201efdd8a') {
const role = construct.node.findChild('ServiceRole') as Role;
// Add required IAM policy to the custom resource handler for log retention
role.addToPolicy(
new PolicyStatement({
actions: ['logs:DeleteLogGroup'],
resources: ['*'],
}),
);
} else if (construct instanceof LogRetention) {
// set removalPolicy to all of the log retention resources
const custom = construct.node.findChild("Resource") as CfnResource;
custom.addOverride("Properties.RemovalPolicy", "destroy");
}
}
}
// register this aspect to your CDK app
Aspects.of(app).add(new LogRemover());
You also need to add log retention to each of your lambda function:
const handler = new lambda.Function(this, 'Handler', {
// ...
// you need this
logRetention: RetentionDays.INFINITE,
});
You should be super careful not to remove required log groups though. For example, once you deploy the above and you just removed logRetention: RetentionDays.INFINITE
not Lambda function itself, your log group will be deleted as described in the body of this PR.
This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.
AWS CodeBuild CI Report
- CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
- Commit ID: fff7bcd468b64ac9348d6d047e44efb41b6ba113
- Result: SUCCEEDED
- Build Logs (available for 30 days)
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
Ping - We're hoping to avoid implementing a hacky workaround for the most obvious non-working solution.
This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.
This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.
This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.
This PR cannot be merged because it has conflicts. Please resolve them. The PR will be considered stale and closed if it remains in an unmergeable state.
@tmokmss Apologies for the delay. Let's wrap up the open questions:
- what API will look like? (ref)
The API for logRetention
is a huge mess and desperately needs to be refactored. In the meantime #2 is the way to go as per the design docs. It has a flat structure and uses RemovalPolicy
. S3 Bucket actually is the only (documented) special case, so it kind of deserves a different property. Although arguably we could have created a RemovalPolicy.DESTROY_WITH_CONTENTS
at the time.
- a log group can be accidentally deleted, as described in the PR body; do we need to handle this issue?
Yes, but that's already implemented in the CR anyway.
I also have one more thing I'd like to clarify:
If a user remove autoDeleteLogGroup property as below, the logRetention resource will be removed from a synthesized template, and it will result in the removal of log group itself. This implicit behavior should be clearly warned in the doc.
I don't see how this is happening with the code you have on the PR. Is this still the case, if so can you help me understand how?
Hi @mrgrain, thanks for the review!
I don't see how this is happening with the code you have on the PR. Is this still the case, if so can you help me understand how?
Sure. Let's say we have the following template deployed.
new lambda.Function(stack, 'Function', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
autoDeleteLogGroup: true,
});
The CFn template for this code looks like this:
"FunctionF74A7E0D": {
"Type": "AWS::Lambda::Function",
"Properties": { ...
}
},
"FunctionLogRetentionAC501FFB": {
"Type": "Custom::LogRetention",
"Properties": {
...
"RemovalPolicy": "destroy"
}
}
Now if we remove the autoDeleteLogGroup property:
new lambda.Function(stack, 'Function', {
code: new lambda.InlineCode('exports.handler = (event) => console.log(JSON.stringify(event));'),
handler: 'index.handler',
runtime: lambda.Runtime.NODEJS_14_X,
- autoDeleteLogGroup: true,
});
the CFn template will look like this:
"FunctionF74A7E0D": {
"Type": "AWS::Lambda::Function",
"Properties": { ...
}
},
- "FunctionLogRetentionAC501FFB": {
- "Type": "Custom::LogRetention",
- "Properties": {
- ...
- "RemovalPolicy": "destroy"
- }
- }
Note that only the log retention resource is deleted, and the function remains as-is.
Here is where the unexpected behavior comes in. When a Custom::LogRetention
with "RemovalPolicy": "destroy"
is removed, it will always remove the log group, even though the log group is still used by the function.
https://github.com/aws/aws-cdk/blob/6c588da2c6b1c2f5a53fb9151c84c7298ce2427e/packages/aws-cdk-lib/aws-logs/lib/log-retention-provider/index.ts#L156-L160
Actually there is no way for the custom resource to know if the corresponding function is removed or not. I thought we could query the Lambda service if the function exists or not, but that is also difficult, because of the dependency between a LogRetention and a function. A LogRetention depends on a corresponding function (to get the function name), so it must be deleted before the function.
https://github.com/aws/aws-cdk/blob/7c7697e4d06f1d0be73da50509a4c73545a1ea87/packages/aws-cdk-lib/aws-lambda/lib/function.ts#L868-L869
I still has not found a good solution for this problem, but any suggestion is welcome! :) I think the easiest approach is just to accept this behavior as expected, and document details about this.
@tmokmss Thanks for the explanation. I understand the problem now. I think the issue comes down to creating a false equality between the LogRetention
custom resource and the LogGroup
that the Lambda function uses. Before #21113 the CR did not concern itself with a the lifecycle of the log group. It was a simple fix to overwrite the retention. Actually https://github.com/aws/aws-cdk/pull/26049 has a similar issue in this regard.
Not quite sure what the solution would be for all of these... 🤔
@mrgrain Another solution would be to always define a LogRetention custom resource regardless of a function's properties. Then even if we remove those properties, the log retention resource and the log group will still be retained.
I hesitated to do this because it would require updates to many many existing integ tests. But If other features are also facing this problem, it might be reasonable to change the current behavior.
Been thinking about this all morning. It's fairly easy to implement a variation of the fix that S3's autoDeleteObjects
has. But only if the parent resource (here: a Lambda Function) is known. It get's complicated for the general case because each resource uses a slightly different tagging interface. 😕
@tmokmss Thanks again for this PR. I'm drafting this for now. We have to solve the deletion issue first. Same as here: https://github.com/aws/aws-cdk/pull/26049
I have some ideas for it though.
@mrgrain ok, thanks! I'm looking forward to that:)
I'm planning to create a generic DependentLogGroup
(or similar) that actually supports all log group properties.
To avoid the deletion of the log group, we need to tag the parent (e.g. Lambda Function) with a special tag. The only blocker is that it's not really possible to read tags from generic parent resource. I think we will have to configure them manually.
Nice! I just found issue #16756 and the discussions there appear to cover most related considerations. After reading them, I'm now convinced that tagging a parent resource is (maybe only) a viable solution. (I didn't know the special ordering of tags.)
Looking up the current deployed template (reverted in d220b949) is interesting but I guess it will also requires special handling for each parent resource type anyway.
This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.
AWS CodeBuild CI Report
- CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
- Commit ID: 5d74c13115c31ccde9f7d291f4fe8d43f3628a82
- Result: SUCCEEDED
- Build Logs (available for 30 days)
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository