ingress-nginx icon indicating copy to clipboard operation
ingress-nginx copied to clipboard

Consistent prometheus metric names and documentation

Open nailgun opened this issue 2 years ago • 7 comments

What this PR does / why we need it:

In https://github.com/kubernetes/ingress-nginx/pull/8726 I've added a new metric and noticed some inconsistency in metric names:

  • _request_duration_seconds Histogram

    The request processing time in milliseconds (affected by client speed)

    nginx var: request_time

  • _response_duration_seconds Histogram

    The time spent on receiving the response from the upstream server (affected by client speed)

    nginx var: upstream_response_time

  • _header_duration_seconds Histogram ADDED

    The time spent on receiving the first header from the upstream server

    nginx var: upstream_header_time

  • _connect_duration_seconds Histogram ADDED

    The time spent on establishing a connection with the upstream server

    nginx var: upstream_connect_time

  • _response_size Histogram

    The response length (including request line, header, and request body)

    nginx var: bytes_sent

  • _request_size Histogram

    The request length (including request line, header, and request body)

    nginx var: request_length

  • _requests Counter

    The total number of client requests

  • _bytes_sent Histogram DEPRECATED

    The number of bytes sent to a client. Deprecated, use _response_size

    nginx var: bytes_sent

  • _ingress_upstream_latency_seconds Summary DEPRECATED

    Upstream service latency per Ingress. Deprecated, use _connect_duration_seconds

    nginx var: upstream_connect_time

  • ingress_upstream_header_seconds Summary REMOVED, I've added it in the previous PR and it has not been released

I also added documentation for all metrics to close https://github.com/kubernetes/ingress-nginx/issues/2924

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation only

Which issue/s this PR fixes

fixes #2924

How Has This Been Tested?

$ make test
$ make lua-test
$ make kind-e2e-test

Also built the image and tested it in the production cluster.

Checklist:

  • [x] My change requires a change to the documentation.
  • [x] I have updated the documentation accordingly.
  • [x] I've read the CONTRIBUTION guide
  • [x] I have added tests to cover my changes.
  • [x] All new and existing tests passed.

nailgun avatar Jun 23 '22 14:06 nailgun

Hi @nailgun. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jun 23 '22 14:06 k8s-ci-robot

/kind documentation /triage accepted /priority backlog

strongjz avatar Jun 23 '22 14:06 strongjz

/ok-to-test

strongjz avatar Jul 02 '22 18:07 strongjz

/assign @rikatz

I would like to get this fix into 1.2.2 release. But this and k8s 1.24 we should probably bump 1.3.0 and add a note about the metrics updates.

@nailgun are they breaking changes in the metrics?

thank you for this great update all around, We may need to rebase after we get 1.24 testing in CI. https://github.com/kubernetes/ingress-nginx/pull/8798

strongjz avatar Jul 10 '22 22:07 strongjz

There are no breaking changes. New metrics are added and old marked as deprecated as they are superseded by new ones.

nailgun avatar Jul 12 '22 07:07 nailgun

@strongjz any blocker on getting this merged?

zifeo avatar Sep 10 '22 10:09 zifeo

/lgtm

strongjz avatar Sep 30 '22 14:09 strongjz

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nailgun, strongjz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Sep 30 '22 14:09 k8s-ci-robot

Just came here to say that nginx_ingress_controller_success is a really confusing name for "Cumulative number of Ingress controller reload operations". Someone should rename this to nginx_ingress_controller_reloads_total to be in line with naming best practices.

discordianfish avatar Apr 22 '23 12:04 discordianfish

Just came here to say that nginx_ingress_controller_success is a really confusing name for "Cumulative number of Ingress controller reload operations". Someone should rename this to nginx_ingress_controller_reloads_total to be in line with naming best practices.

@discordianfish if possible, kindly submit a PR because your statement seems reasonable and a PR will move the thought forward.

longwuyuan avatar Apr 24 '23 18:04 longwuyuan

I might get to it but can't promise, so if you want this to happen maybe submit a PR :) The change is trivial, figuring out how to roll it out without breaking people monitoring these metrics currently is not.. So this might be best driven by a maintainer.

discordianfish avatar Apr 26 '23 11:04 discordianfish