openshift-docs
OSDOCS-9480: NetObserv Custom Metrics
Version(s):
Issue:
https://issues.redhat.com/browse/OSDOCS-9480
Link to docs preview:
Custom Metrics conceptual info: https://74412--ocpdocs-pr.netlify.app/openshift-enterprise/latest/observability/network_observability/metrics-alerts-dashboards.html#network-observability-custom-metrics_metrics-dashboards-alerts
QE review:
- [ ] QE has approved this change.
Additional information:
@skrthomas: This pull request references OSDOCS-9480 which is a valid jira issue.
In response to this:
Version(s):
Issue:
https://issues.redhat.com/browse/OSDOCS-9480
Link to docs preview:
QE review:
- [ ] QE has approved this change.
Additional information:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
🤖 Wed May 29 02:11:31 - Prow CI generated the docs preview:
https://74412--ocpdocs-pr.netlify.app/openshift-enterprise/latest/observability/network_observability/metrics-alerts-dashboards.html
@jotak can you PTAL at this first draft?
@skrthomas FYI, I'm adding docs to our upstream Metrics.md here: https://github.com/netobserv/network-observability-operator/pull/609/files#diff-66471adf627be69953aa4a35b4c845a3ff885b746f0bd9ee9c3179fe38a15886
It might be useful for this PR too.
@jotak Thanks for linking me to that upstream documentation. It helped me better understand how the predefined and custom metrics fit together, so I can organize the documentation better. Thanks also for the great examples in there; I added them to this documentation as well.
Just a few comments, other than that LGTM!
There's some duplicated text between sections "Predefined Metrics" and "Network Observability metrics":
The first sentence is repeated (I saw this using the preview link, not sure if it's up to date)
Also, I know adding links to GitHub is generally avoided, but shouldn't we link to the samples directory that we have for custom metrics (https://github.com/netobserv/network-observability-operator/tree/main/config/samples/flowmetrics)? I think it will be useful for customers.
@skrthomas I found something else to update: the Alert example in this section still uses PrometheusRule instead of AlertingRule. When we made the change there, I guess we forgot to do it in this section as well.
The new YAML should be:
apiVersion: monitoring.openshift.io/v1
kind: AlertingRule
metadata:
  name: netobserv-alerts
  namespace: openshift-monitoring
spec:
  groups:
  - name: NetObservAlerts
    rules:
    - alert: NetObservIncomingBandwidth
      annotations:
        message: |-
          {{ $labels.job }}: incoming traffic exceeding 10 MBps for 30s on {{ $labels.DstK8S_OwnerType }} {{ $labels.DstK8S_OwnerName }} ({{ $labels.DstK8S_Namespace }}).
        summary: "High incoming traffic."
      expr: sum(rate(netobserv_workload_ingress_bytes_total{SrcK8S_Namespace="openshift-ingress"}[1m])) by (job, DstK8S_Namespace, DstK8S_OwnerName, DstK8S_OwnerType) > 10000000
      for: 30s
      labels:
        severity: warning
(just changing apiVersion, kind, and namespace)
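For readers following the alert expression above, here is a minimal Python sketch of what the comparison against 10000000 amounts to. It is an illustration only: the function names and the sample values are hypothetical, and the rate calculation is a simplification (the real PromQL rate() also handles counter resets and extrapolates to the window edges).

```python
from typing import Sequence, Tuple

# Threshold taken from the alert's expr: 10000000 bytes per second (10 MBps).
THRESHOLD_BYTES_PER_SECOND = 10_000_000


def average_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Per-second average increase of a counter between the first and last sample.

    Each sample is (timestamp_seconds, cumulative_bytes). This is a simplified
    stand-in for rate(), not a reimplementation of it.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by increasing timestamp")
    return (v1 - v0) / (t1 - t0)


def exceeds_threshold(samples: Sequence[Tuple[float, float]]) -> bool:
    """True when the average ingress rate is above the alert threshold."""
    return average_rate(samples) > THRESHOLD_BYTES_PER_SECOND


# Hypothetical counter samples over a one-minute window.
samples = [(0.0, 0.0), (30.0, 450_000_000.0), (60.0, 900_000_000.0)]
print(average_rate(samples))       # 15000000.0
print(exceeds_threshold(samples))  # True
```

In the actual rule, the condition must also keep holding for the duration given in the for field (30s) before the alert fires; the sketch does not model that part.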
Hmmm, this is suspicious because I definitely made the change to the Creating alerts topic, which you can see here: https://github.com/openshift/openshift-docs/pull/74099/files#diff-8d0022591f4f4567d0cc2ff90110d33839696453336a9fd67a8d766ff79f3f48R22. That change was merged to 4.14+, and the entire example was removed from 4.12 and 4.13 (cf. https://github.com/openshift/openshift-docs/pull/74100).
When I look at the live doc, it says AlertingRule. I think what happened is that I created the no-1.6 branch on April 9: this Custom Metrics PR was the first PR I opened in that branch, and it says I opened it on April 9. I also see that I merged the PRs that change to AlertingRule on April 9. When I created my no-1.6 branch, it copied the openshift-docs main branch as it was at that time, and it doesn't sync back up until I rebase everything in the no-1.6 branch onto the main branch, which I was doing at the end of the cycle. So let me try to balance my branches, and maybe that will fix this weirdness.
/retest
I agree that all these samples are really helpful for customers, which is why I wanted to include them all in my initial draft (although I can appreciate the maintenance burden you're concerned about). When I've asked about things like this in the past, the preference has been to include them in the documentation itself rather than link to GitHub. Our contrib guide has a pretty clear directive not to include links to GitHub: https://github.com/openshift/openshift-docs/blob/main/contributing_to_docs/doc_guidelines.adoc#links-to-external-websites , and the exception is one I got approval for specifically for our API documentation.
@skrthomas OK, my bad on the AlertingRule. I should have checked the actual doc, which is fine. I guess it's just a matter of an outdated branch, so it shouldn't be an issue.
For the GitHub link, I was expecting / fearing this answer :-) OK then, let's just forget about it.
@jotak I'm sorry :/ I will happily add all the samples back to the doc (it won't be that bad to just grab them from my previous commit) and create a Jira ticket to remind me to ask about any maintenance issues that may come up. The reason is that we don't want to send customers off the docs pages to GitHub and then rely on them to come back. The idea is that it's better to keep them in one place if possible, so when the question is whether or not to link, the expectation is that we put the upstream doc in the downstream doc.
Regarding "There's some duplicated text between sections 'Predefined Metrics' and 'Network Observability metrics'": I removed the duplicated text from the Network Observability metrics section. It's supposed to be only in Predefined Metrics.
@memodi when you have a chance, can you PTAnotherL at this PR? I addressed your comments.
@skrthomas: This pull request references OSDOCS-9480 which is a valid jira issue.
In response to this:
Merge to only the no-1.6 branch; no cherry-picks are required. This PR is part of an experiment for simplifying merges for asynchronous content, and I will open one PR against main to incorporate all of the Network Observability 1.6 content just before its GA.
Version(s):
Issue:
https://issues.redhat.com/browse/OSDOCS-9480
Link to docs preview:
Custom Metrics conceptual info: https://74412--ocpdocs-pr.netlify.app/openshift-enterprise/latest/observability/network_observability/metrics-alerts-dashboards.html#network-observability-custom-metrics_metrics-dashboards-alerts
Configuring custom metrics: https://74412--ocpdocs-pr.netlify.app/openshift-enterprise/latest/observability/network_observability/metrics-alerts-dashboards.html#network-observability-configuring-custom-metrics_metrics-dashboards-alerts
Configuring custom charts: https://74412--ocpdocs-pr.netlify.app/openshift-enterprise/latest/observability/network_observability/metrics-alerts-dashboards.html#network-observability-custom-charts-flowmetrics_metrics-dashboards-alerts
QE review:
- [ ] QE has approved this change.
Additional information:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@skrthomas: all tests passed!
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.