cluster-logging-operator
Add retention period config for AWS CloudWatch Logs
Description
This PR adds functionality to the CloudWatch forwarder: a new option that defines a retention period for newly created CloudWatch Log Groups.
We use the groupBy: namespaceName option in the CloudWatch ClusterLogForwarder output, and our cluster creates a lot of new namespaces. With that workload and configuration, many new AWS CloudWatch Log Groups are created, and by default their logs never expire. In our case, we need an automated way to set the retention period for newly created groups, so the cleanest solution is to configure on the collector side what the retention period for new log groups will be.
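For illustration, a minimal sketch of how such an output could look is below. The cloudwatch fields groupBy, groupPrefix, and region follow the existing v1 API; the retention field name (retentionInDays) is only a placeholder for whatever name this PR settles on, and the prefix and secret name are assumptions.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: cw
      type: cloudwatch
      cloudwatch:
        groupBy: namespaceName        # one log group per namespace
        groupPrefix: rhacs-stage      # example prefix; real clusters use their own
        region: us-east-1
        retentionInDays: 30           # hypothetical field illustrating this PR's retention option
      secret:
        name: cw-credentials          # assumed secret holding AWS credentials
  pipelines:
    - name: app-to-cloudwatch
      inputRefs: [application]
      outputRefs: [cw]
```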
/cc @jcantrill /assign @vimalk78
Considerations
- If we have documented a list of required AWS IAM policies for the CloudWatch ClusterLogForwarder somewhere, we should also note that the retention period option additionally requires the logs:PutRetentionPolicy permission (an illustrative policy sketch follows below).
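As a hedged illustration (the baseline set of actions is my assumption of what the docs would list, not a quote from them), the IAM statement could look like this in CloudFormation-style YAML, with logs:PutRetentionPolicy added for the new option:

```yaml
# Illustrative IAM statement; the baseline actions are assumptions,
# logs:PutRetentionPolicy is the addition required by the retention option.
PolicyDocument:
  Version: "2012-10-17"
  Statement:
    - Effect: Allow
      Action:
        - logs:CreateLogGroup
        - logs:CreateLogStream
        - logs:DescribeLogGroups
        - logs:DescribeLogStreams
        - logs:PutLogEvents
        - logs:PutRetentionPolicy   # needed to set the retention period on new groups
      Resource: "arn:aws:logs:*:*:*"
```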
Links
- JIRA: https://issues.redhat.com/browse/ROX-12486
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: mtodor
Once this PR has been reviewed and has the lgtm label, please assign jcantrill for approval by writing /assign @jcantrill in a comment. For more information see: The Kubernetes Code Review Process.
@alanconway please review. I'm not sure how I feel about adding something like this to CLF
/hold
I've checked the vector config, and I didn't find an option for it. I'm not sure if this is the right place? https://vector.dev/docs/reference/configuration/sinks/aws_cloudwatch_logs/
Normally retention is defined at the data store, not by the collector. Can it be set at the cloudwatch end? What's the reason for defining it at the collector in this case? The PR looks good, but I want to make sure there's a good reason for doing this at the forwarder.
@alanconway Unfortunately, there is no option to set a default retention period for new log groups in AWS. It's possible to set it only on a single log group.
The problem we are facing is the following: our fleet manager service creates a lot of new namespaces, because every new instance of the managed service is encapsulated in its own namespace. When collecting logs, we use groupBy: namespaceName so that all logs of one service instance end up in one log group. The main problem is that we get a new log group with each new dynamically created service, and log groups are named like rhacs-stage-123-cxabc.rhacs-nwiyogywn2rkzjgzogjl, where the last hash is the dynamic part.
Because of that, we cannot set the retention period beforehand; it's simply the nature of dynamic namespace creation. I have also provided some information in the description of this PR.
Another use case is development clusters. We often create short-lived (staging) OpenShift clusters that exist for 2-3 days and are cleaned up automatically, and we have logging enabled there as well. Several new log groups are created with each new cluster (audit, infrastructure, the operator namespace, and namespaces for tested service instances).
I'm not familiar with the project, so maybe the forwarder is not the best place for this option. Perhaps it's better to make it part of FluentdForwarderSpec in ClusterLogging; in that case it would be better encapsulated as part of the fluentd config. I'm open to suggestions.
I think the wider OpenShift community can also benefit from this functionality.
And log groups have naming like this: rhacs-stage-123-cxabc.rhacs-nwiyogywn2rkzjgzogjl. The last hash is the dynamic part.
Uniqueness in an OpenShift cluster is controlled via the name, not a UUID. We added the UUID in versions of logging long ago as a mechanism to distinguish between same-name namespaces in the create/delete/create workflow; we are no longer doing this. I would argue that maybe we should not be adding the UUID at all, since this information really is not useful or discoverable after a namespace is deleted. Maybe we should remove it, which seems like it may resolve your issue. I am really less inclined to add this to CLF given it is a "storage" retention feature; this seems like something you should control on that side and not at the collection point. Also, if vector does not support this capability, I would be even less in favor of adding this option.
@jcantrill I'm not sure if this relates to namespace UUIDs - I think the namespace names have a random suffix. @mtodor is that correct?
@mtodor If cloudwatch doesn't provide a way to set retention on the storage side, then this seems like a reasonable addition to me. It means that cloudwatch storage is at the mercy of cluster admins to get retention right, which doesn't seem ideal, but I'll take your word for it that this is required.
/lgtm
@jcantrill I'm not sure if this relates to namespace UUIDs - I think the namespace names have a random suffix. @mtodor is that correct?
Namespace name never has a random suffix unless a user gives it one. Pod names have a random suffix if they are derived from a ReplicaSet. I'm fairly certain we added the namespace UUID as a mechanism to mimic what we did with Elasticsearch, to guarantee uniqueness across namespaces for the delete/create scenario where the namespace is recreated with the same name. If this issue can be moved to the storage side by allowing users to "opt out" of appending the namespace UUID to the group, that seems like a better choice. Retention is part of storage, not collection.
@mtodor If cloudwatch doesn't provide a way to set retention on the storage side, then this seems like a reasonable addition to me. It means that cloudwatch storage is at the mercy of cluster admins to get retention right, which doesn't seem ideal, but I'll take your word for it that this is required.
@jcantrill I'm not sure if this relates to namespace UUIDs - I think the namespace names have a random suffix. @mtodor is that correct?
Namespace name never has a random suffix unless a user gives it such.
Agreed - and as I read it, the user is adding a random hash to namespace names. @mtodor can you confirm? The random hash is not the problem here - any system that creates unpredictable new namespaces, whether they use random hashes or not, is going to have the same problem - all possible namespaces can't be known in advance.
If this issue can be moved to the storage side by allowing users to "opt out" of appending the namespace UUID to the group, that seems like a better choice. Retention is part of storage, not collection.
Agreed again, but it appears cloudwatch can't define "wildcard" retentions for unpredictable namespaces - @mtodor can you confirm that is the case? If it is the case then client-side retention settings are the only option.
Need confirmation from @mtodor to proceed...
@alanconway You are correct. In AWS, it's only possible to set a retention period for a single CloudWatch Log Group. And yes, we don't know all namespaces in advance. Along with that, we cannot pre-create all AWS CloudWatch Log Groups.
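To illustrate the limitation (a sketch under my reading of the AWS model, not text from this PR): retention in CloudWatch can only be attached to a concrete, named log group, for example via a CloudFormation-style resource like the one below, so every group has to be known and created in advance; the group name here is a placeholder.

```yaml
# Retention is a per-log-group property; there is no "default retention for future groups".
Resources:
  KnownLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: rhacs-stage-123-cxabc.some-known-namespace   # placeholder, must be known up front
      RetentionInDays: 30
```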
As I explained before, independently of namespaces, the same problem emerges when short-lived clusters are used (e.g., feature-branch testing or development clusters), because log group names are prefixed by the cluster name.
From my understanding, the only way to define the retention period upfront with the logging operator is to use groupPrefix. In that case, we would have a single group per log type (3 in total) for all clusters and all namespaces inside them, but logs structured like that would not be as convenient to work with.
I'm OK with this, given it's a cloudwatch-specific extension for a cloudwatch-specific problem. Leaving the /hold so @jcantrill can chime in if he has any remaining concerns, otherwise @jcantrill can you remove the hold?
I don't see this as an available option for vector, which is the path we are going down: https://vector.dev/docs/reference/configuration/sinks/aws_cloudwatch_logs . IMO we should add this feature to the upstream API before introducing it into our CLF, otherwise we have yet another bit that is not compatible.
@jcantrill good point, I was assuming it was available in vector also.
@mtodor Here's an alternate solution that I think may work better in the long run (see the sketch after this list); let me know what you think:
- Modify the forwarder: extend cloudwatch.groupBy to allow kubernetes.label.* as a value.
- In your clusters, add pod labels, something like logGroup: shortTerm or whatever.
- Now your logs will be routed to predictable groups (shortTerm, longTerm, etc.) and you can set retention appropriately.
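To make the proposal concrete, a rough sketch is below. Note that groupBy: kubernetes.label.logGroup is hypothetical (the proposed extension, not a value the current API accepts), and the label name logGroup, image, and secret name are just examples.

```yaml
# Hypothetical sketch of the proposed label-based grouping; not a supported API today.
# An application pod opts into a retention class via a label:
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  labels:
    logGroup: shortTerm               # retention class chosen by the application owner
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest
---
# The forwarder would route logs to a group derived from that label:
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: cw
      type: cloudwatch
      cloudwatch:
        groupBy: kubernetes.label.logGroup   # proposed extension, not in the current API
        region: us-east-1
      secret:
        name: cw-credentials
  pipelines:
    - name: app-to-cloudwatch
      inputRefs: [application]
      outputRefs: [cw]
```

Retention for the small, fixed set of groups (shortTerm, longTerm, ...) could then be set once on the CloudWatch side.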
This has several advantages over setting the retention at the forwarder:
- Cluster admins can specify a retention policy (label value) based on the volume/importance of an application.
- Actual retention periods are set centrally by the CloudWatch admin. If the database gets overloaded, it's possible to reduce the retention periods (by reducing the retention for each policy) without going back to reconfigure each cluster. The relative importance of the policies can be preserved within the central storage limits (e.g., reduce all the periods by 20%).
- It does not require the counter-intuitive "client sets retention" feature.
- It's consistent with how we have handled similar situations on other outputs.
- It is potentially useful for other cases where users want to control how logs are routed to CloudWatch groups for reasons other than retention.
Would that solve the problem? Sorry for delaying your excellent PR, but this is something we need to get right, or it will bite us later as customers scale up or run into storage control issues.
@mtodor: all tests passed!
@mtodor any thoughts on https://github.com/openshift/cluster-logging-operator/pull/1645#issuecomment-1265634764 ?
@mtodor: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/e2e-target | 15ce50017984d1be2dd89b6bf9fddefb695c3393 | link | true | /test e2e-target |
@alanconway @jcantrill, I owe you an update.
We have switched to groupBy: logType with a predefined groupPrefix per cluster. With that configuration, we know we will have exactly 3 log groups, and we know their names, so we can pre-create the log groups and set retention periods via Terraform.
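For completeness, a sketch of that configuration (the prefix, region, and secret name are placeholders, not our real values):

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: cw
      type: cloudwatch
      cloudwatch:
        groupBy: logType               # yields exactly 3 groups: application, infrastructure, audit
        groupPrefix: rhacs-stage-123   # placeholder per-cluster prefix, known in advance
        region: us-east-1
      secret:
        name: cw-credentials           # assumed secret with AWS credentials
  pipelines:
    - name: all-to-cloudwatch
      inputRefs: [application, infrastructure, audit]
      outputRefs: [cw]
```

As I understand it, the resulting group names follow the groupPrefix.logType pattern, so the matching log groups and their retention policies can be created ahead of time.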
One downside is the cost when we investigate a problem, because CloudWatch Logs Insights queries will run over a larger dataset. At this point, it's not a concern.
The upsides are that the number of log groups is reduced, which makes the overview in CloudWatch cleaner, and investigations in CloudWatch Logs are simpler with just one group to look at.