ingress-nginx Consistent prometheus metric names and documentation

What this PR does / why we need it:

In https://github.com/kubernetes/ingress-nginx/pull/8726 I've added a new metric and noticed some inconsistency in metric names:

_request_duration_seconds Histogram

The request processing time in milliseconds (affected by client speed)

nginx var: request_time
_response_duration_seconds Histogram

The time spent on receiving the response from the upstream server (affected by client speed)

nginx var: upstream_response_time
_header_duration_seconds Histogram ADDED

The time spent on receiving the first header from the upstream server

nginx var: upstream_header_time
_connect_duration_seconds Histogram ADDED

The time spent on establishing a connection with the upstream server

nginx var: upstream_connect_time
_response_size Histogram

The response length (including request line, header, and request body)

nginx var: bytes_sent
_request_size Histogram

The request length (including request line, header, and request body)

nginx var: request_length
_requests Counter

The total number of client requests
_bytes_sent Histogram DEPRECATED

The number of bytes sent to a client. Deprecated, use _response_size

nginx var: bytes_sent
_ingress_upstream_latency_seconds Summary DEPRECATED

Upstream service latency per Ingress. Deprecated, use _connect_duration_seconds

nginx var: upstream_connect_time
ingress_upstream_header_seconds Summary REMOVED, I've added it in the previous PR and it has not been released

I also added documentation for all metrics to close https://github.com/kubernetes/ingress-nginx/issues/2924

Types of changes

[x] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
[ ] Documentation only

Which issue/s this PR fixes

fixes #2924

How Has This Been Tested?

$ make test
$ make lua-test
$ make kind-e2e-test

Also built the image and tested it in the production cluster.

Checklist:

[x] My change requires a change to the documentation.
[x] I have updated the documentation accordingly.
[x] I've read the CONTRIBUTION guide
[x] I have added tests to cover my changes.
[x] All new and existing tests passed.

Jun 23 '22 14:06 nailgun

Hi @nailgun. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Jun 23 '22 14:06 k8s-ci-robot

/kind documentation /triage accepted /priority backlog

Jun 23 '22 14:06 strongjz

/ok-to-test

Jul 02 '22 18:07 strongjz

/assign @rikatz

I would like to get this fix into 1.2.2 release. But this and k8s 1.24 we should probably bump 1.3.0 and add a note about the metrics updates.

@nailgun are they breaking changes in the metrics?

thank you for this great update all around, We may need to rebase after we get 1.24 testing in CI. https://github.com/kubernetes/ingress-nginx/pull/8798

Jul 10 '22 22:07 strongjz

There are no breaking changes. New metrics are added and old marked as deprecated as they are superseded by new ones.

Jul 12 '22 07:07 nailgun

@strongjz any blocker on getting this merged?

Sep 10 '22 10:09 zifeo

/lgtm

Sep 30 '22 14:09 strongjz

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nailgun, strongjz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [strongjz]

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

Sep 30 '22 14:09 k8s-ci-robot

Just came here to say that nginx_ingress_controller_success is a really confusing name for "Cumulative number of Ingress controller reload operations". Someone should rename this to nginx_ingress_controller_reloads_total to be in line with naming best practices.

Apr 22 '23 12:04 discordianfish

Just came here to say that nginx_ingress_controller_success is a really confusing name for "Cumulative number of Ingress controller reload operations". Someone should rename this to nginx_ingress_controller_reloads_total to be in line with naming best practices.

@discordianfish if possible, kindly submit a PR because your statement seems reasonable and a PR will move the thought forward.

Apr 24 '23 18:04 longwuyuan

I might get to it but can't promise, so if you want this to happen maybe submit a PR :) The change is trivial, figuring out how to roll it out without breaking people monitoring these metrics currently is not.. So this might be best driven by a maintainer.

Apr 26 '23 11:04 discordianfish

ingress-nginx ingress-nginx copied to clipboard

Consistent prometheus metric names and documentation

What this PR does / why we need it:

Types of changes

Which issue/s this PR fixes

How Has This Been Tested?

Checklist:

ingress-nginx
ingress-nginx copied to clipboard