MON-1157: Revive prometheus metrics best practices
@jan--f: This pull request references MON-1157, which is a valid Jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.16.0" version, but no target version was set.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/cc @dgrisonnet
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jan--f
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~test/extended/prometheus/OWNERS~~ [jan--f]
- ~~test/extended/util/prometheus/OWNERS~~ [jan--f]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Fwiw, this is for now just an attempt to revive code that Damien already wrote. I have not checked whether the revived code makes sense in the current test setup. This might need significantly more work.
The initial concern with that test was that it might create disruption in CI if not done properly. I don't really know what would be the safest way to introduce it, but at least you would need to update the exception list and then maybe make it flaky for a bit, or notify TRT to have them watch the test and revert in case something bad happens.
Another concern was that e2e tests might not catch all the metrics & labels, since that depends on whether the scenario that triggers the generation of a particular time series is exercised or not. But having at least some of the metrics checked is already better than nothing, so I don't think that concern holds anymore.
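For context, here is a minimal sketch of the kind of check being discussed, not the code in this PR: lint a text-format metrics exposition with Prometheus' `promlint` package and skip anything on an exception list. The scrape URL and the exception entry below are hypothetical placeholders.

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/prometheus/prometheus/util/promlint"
)

// exceptions lists metrics that are known to violate best practices and are
// tolerated for now (hypothetical entry; the real list would live in the test).
var exceptions = map[string]bool{
	"legacy_metric_without_unit": true,
}

func main() {
	// Fetch a text-format metrics exposition. In the e2e test this would come
	// from the cluster's Prometheus rather than a local endpoint.
	resp, err := http.Get("http://localhost:9090/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// promlint checks metric names, help strings and units against the
	// upstream best-practice guidelines.
	problems, err := promlint.New(resp.Body).Lint()
	if err != nil {
		panic(err)
	}

	for _, p := range problems {
		if exceptions[p.Metric] {
			continue // known offender, don't fail on it
		}
		fmt.Printf("metric %q violates best practices: %s\n", p.Metric, p.Text)
	}
}
```

In a real test, any problem not covered by the exception list would fail the run, so new violations would have to be fixed or explicitly added to the list.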
@jan--f: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-gcp-ovn-rt-upgrade | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-gcp-ovn-rt-upgrade |
| ci/prow/verify | c006ed9d393a090e61feba70159bb5fbd410da1f | link | true | /test verify |
| ci/prow/e2e-gcp-csi | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-gcp-csi |
| ci/prow/e2e-metal-ipi-sdn | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-metal-ipi-sdn |
| ci/prow/e2e-agnostic-ovn-cmd | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-agnostic-ovn-cmd |
| ci/prow/e2e-gcp-ovn | c006ed9d393a090e61feba70159bb5fbd410da1f | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-aws-ovn-fips | c006ed9d393a090e61feba70159bb5fbd410da1f | link | true | /test e2e-aws-ovn-fips |
| ci/prow/e2e-aws-ovn-single-node | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-aws-ovn-single-node |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | c006ed9d393a090e61feba70159bb5fbd410da1f | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/e2e-aws-ovn-cgroupsv2 | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-aws-ovn-cgroupsv2 |
| ci/prow/e2e-aws-ovn-single-node-serial | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-aws-ovn-single-node-serial |
| ci/prow/e2e-openstack-ovn | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-aws-ovn-single-node-upgrade | c006ed9d393a090e61feba70159bb5fbd410da1f | link | false | /test e2e-aws-ovn-single-node-upgrade |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closed this PR.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.