cluster-logging-operator
LOG-2207: Add policy-based log flow control in CLO
Description
"Flow control" refers to how the logging system behaves when logs are produced faster than they can be collected or forwarded. This PR enhances the API to let cluster administrators limit logging rates, or ignore some logs entirely. Logs may still be lost if the collector cannot keep up, but administrators have more control over what is lost, and more predictability of log rates.
Control log rates and overflow policy at two points in the log forwarder:
- Output: controlling the flow rate per destination to selected outputs.
  - Limit the rate of outbound logs to match output network and storage capacity.
  - Controls aggregated (per-destination) output rate.
- Input: Controlling log flow rates per container from selected containers.
  - Limit the rate of log collection for selected groups of containers per-container.
  - Controls individual (per-container) collection throttling.
Note:
- The limit is applied as a number of records, not bytes.
- This enhancement does not include a `block` policy, which would back-pressure containers that exceed rate limits, forcing them to block on stdout/stderr and slow down to keep within the rate limit.
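To make the difference between the two control points concrete, here is a minimal Go sketch of a record-counting limiter with a drop policy. It is not the operator's or the collector's implementation: the `limiter` and `perContainer` types, the fixed one-second window, and the container names are all assumptions chosen for illustration. The input side keeps one counter per container, while the output side shares a single counter for the destination.

```go
package main

import "fmt"

// limiter is a hypothetical fixed-window counter: at most max records per
// one-second window. It counts records, not bytes, and records over the
// limit are dropped rather than queued or back-pressured.
type limiter struct {
	max    int
	second int64 // Unix second of the current window
	count  int   // records accepted in the current window
}

// allow reports whether a record arriving at Unix second now is kept.
func (l *limiter) allow(now int64) bool {
	if now != l.second { // a new window starts: reset the count
		l.second, l.count = now, 0
	}
	if l.count >= l.max {
		return false // "drop" policy: discard the record
	}
	l.count++
	return true
}

// perContainer keeps one independent limiter per container name. This is
// the input-side behaviour; an output-side limit would use a single shared
// limiter for everything sent to one destination.
type perContainer struct {
	max      int
	limiters map[string]*limiter
}

// allow applies the per-container limit to a record from the named container.
func (p *perContainer) allow(container string, now int64) bool {
	l, ok := p.limiters[container]
	if !ok {
		l = &limiter{max: p.max}
		p.limiters[container] = l
	}
	return l.allow(now)
}

func main() {
	input := &perContainer{max: 2, limiters: map[string]*limiter{}}
	output := &limiter{max: 3} // aggregated limit for one destination
	kept := 0
	// Two hypothetical containers each emit three records in the same second.
	for i := 0; i < 3; i++ {
		for _, c := range []string{"a", "b"} {
			if input.allow(c, 100) && output.allow(100) {
				kept++
			}
		}
	}
	fmt.Println("records forwarded:", kept)
}
```

In this sketch a chatty container only exhausts its own input budget, whereas the output budget is shared by every container routed to that destination.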
Example: Set a per-container limit for containers with certain labels
inputs:
- application:
    selector:
      matchLabels: { importance: low }
    limitPerContainer:
      policy: drop
      maxRecordsPerSecond: 10
- application:
    selector:
      matchLabels: { importance: high }
    limitPerContainer:
      policy: drop
      maxRecordsPerSecond: 1000
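The way the example's selectors map a container to a limit can be sketched in Go as follows. This is an illustration only: the `inputLimit` type, the `limitFor` helper, and the first-match-wins rule are assumptions, not the operator's actual types or resolution order. The label values and limits are the ones from the example above.

```go
package main

import "fmt"

// inputLimit mirrors one entry of the example: a label selector plus a
// per-container record limit. The type and field names are hypothetical.
type inputLimit struct {
	matchLabels         map[string]string
	maxRecordsPerSecond int
}

// matches reports whether every selector label is present in labels with
// the same value. An empty selector matches any label set.
func (in inputLimit) matches(labels map[string]string) bool {
	for k, want := range in.matchLabels {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// limitFor returns the limit of the first input whose selector matches, and
// false when no input selects the container. First-match-wins is an
// assumption made for this sketch.
func limitFor(inputs []inputLimit, labels map[string]string) (int, bool) {
	for _, in := range inputs {
		if in.matches(labels) {
			return in.maxRecordsPerSecond, true
		}
	}
	return 0, false
}

func main() {
	inputs := []inputLimit{
		{matchLabels: map[string]string{"importance": "low"}, maxRecordsPerSecond: 10},
		{matchLabels: map[string]string{"importance": "high"}, maxRecordsPerSecond: 1000},
	}
	for _, labels := range []map[string]string{
		{"importance": "low"},
		{"importance": "high"},
		{"app": "unlabelled"},
	} {
		limit, ok := limitFor(inputs, labels)
		fmt.Println(labels, limit, ok)
	}
}
```

A container that matches no selector gets no limit in this sketch, which corresponds to leaving its logs unthrottled at the input.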
/cc @jcantrill @vimalk78 @eranra
/assign @alanconway
Links
- Depending on PR(s): NA
- Bugzilla: NA
- GitHub issue: NA
- JIRA: https://issues.redhat.com/browse/LOG-2207
- Enhancement proposal: https://issues.redhat.com/browse/LOG-1043
cc @alanconway @vimalk78
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jcantrill, Pranjal-Gupta2
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [jcantrill]
Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment
/hold
@Pranjal-Gupta2: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/functional | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | true | /test functional |
ci/prow/e2e-ocp-next | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | false | /test e2e-ocp-next |
ci/prow/e2e | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | true | /test e2e |
ci/prow/e2e-claim-aws | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | false | /test e2e-claim-aws |
ci/prow/unit | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | true | /test unit |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/test functional
/retest
/retest
/hold cancel
/hold
@Pranjal-Gupta2 make sure to work with @syedriko if needed to ensure the vector changes are included with the image we are using for the 5.8 release
/test e2e-target
/test functional
/test e2e-target
/test functional
/test e2e
@jcantrill: The specified target(s) for `/test` were not found.
The following commands are available to trigger required jobs:
- /test ci-index-cluster-logging-operator-bundle
- /test e2e-target
- /test functional
- /test images
- /test lint
- /test unit
The following commands are available to trigger optional jobs:
- /test e2e-ocp-target-minus-one
- /test e2e-ocp-target-minus-two
- /test functional-target
Use `/test all` to run all jobs.
In response to this:
/test e2e
/retest
Merged https://github.com/openshift/cluster-logging-operator/pull/2055 to hopefully resolve one flake.
@Pranjal-Gupta2 you may need to rebase on https://github.com/openshift/cluster-logging-operator/pull/2055, if you have not already, to take advantage of the changes. I'm not certain they fix the issue, but the tests in that PR passed without reproducing the issue at hand.
/retest
@Pranjal-Gupta2: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/e2e-ocp-next | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | false | /test e2e-ocp-next |
ci/prow/e2e | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | true | /test e2e |
ci/prow/e2e-claim-aws | 95fe2b20bdfeaa3c455a8aa1f918ff2707d2ca20 | link | false | /test e2e-claim-aws |
ci/prow/ci-bundle-cluster-logging-operator-bundle | 381d32668d2a0948d6edff356c79d239b98072dd | link | true | /test ci-bundle-cluster-logging-operator-bundle |
/retest
/hold
/lgtm
/hold cancel