ingress-nginx
Expose 2 more Prometheus metrics if it is possible
What do you want to happen? If it is possible, we would like to have 2 more Prometheus metrics exposed:
- nginx_worker_processes_count
- total_max_connections
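For illustration, this is roughly what the two requested gauges would look like on a /metrics endpoint in the Prometheus text exposition format. It is only a sketch: neither metric exists in ingress-nginx today, the helper `render_gauge` is hypothetical, and the values are made-up examples.

```python
def render_gauge(name: str, help_text: str, value: float) -> str:
    """Render one gauge sample in the Prometheus text exposition format."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name} {value}\n"
    )


if __name__ == "__main__":
    # Example values only; these metrics are the ones requested in this issue.
    print(render_gauge("nginx_worker_processes_count",
                       "Number of NGINX worker processes.", 4), end="")
    print(render_gauge("total_max_connections",
                       "worker_processes * max connections per worker.", 65536), end="")
```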
Why could this be useful?
- nginx_worker_processes_count. Currently, only a metric for the total number of processes is available (https://github.com/kubernetes/ingress-nginx/blob/4328bed66326a8a2eec84e5676e84b001ccf4b36/docs/user-guide/monitoring.md?plain=1#L425). It would be more useful to know how many of those processes are workers that can serve requests.
- total_max_connections. Having total_max_connections (which can be calculated as worker_processes * max_connections per process) would let us configure an alert that detects in time when ingress-nginx is overloaded. ingress-nginx may have enough CPU/memory resources and still be badly configured, so that it does not use its full capacity (this could be fixed, for example, by increasing the worker_processes number).
- In the future, that metric could also be used to scale up ingress-nginx pods through HPA instead of the CPU/memory metrics (please let me know if this can already be done, or if you know of better ways to do it :) )
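To make the calculation above concrete, here is a minimal sketch in Python. It assumes that "max_connections per process" corresponds to the NGINX `worker_connections` directive and that workers can be recognised by the process title `nginx: worker process`. The function names and the sample values are hypothetical, not part of ingress-nginx.

```python
from typing import Iterable


def count_worker_processes(process_titles: Iterable[str]) -> int:
    """Count processes whose title is exactly that of an NGINX worker.

    Exact matching skips the master process and helper processes, and also
    workers that are shutting down (their title carries a longer suffix).
    """
    return sum(1 for title in process_titles
               if title.strip() == "nginx: worker process")


def total_max_connections(worker_processes: int, worker_connections: int) -> int:
    """Upper bound on simultaneous connections across all workers."""
    return worker_processes * worker_connections


def connection_utilization(active: int, worker_processes: int,
                           worker_connections: int) -> float:
    """Fraction of the connection capacity currently in use."""
    capacity = total_max_connections(worker_processes, worker_connections)
    if capacity == 0:
        return 0.0  # avoid dividing by zero when no workers are configured
    return active / capacity


if __name__ == "__main__":
    # Made-up process list for illustration.
    titles = [
        "nginx: master process nginx -c /etc/nginx/nginx.conf",
        "nginx: worker process",
        "nginx: worker process",
        "nginx: cache manager process",
    ]
    workers = count_worker_processes(titles)
    print(workers, total_max_connections(workers, 16384))
    print(connection_utilization(24576, workers, 16384))
```

An alert could then fire when the utilization stays above a chosen threshold. Note that in NGINX the per-worker limit also counts connections to proxied upstreams, so the real client capacity is lower than this upper bound.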
Thanks for all your work on this project. You usually say that you are not the standard ingress controller solution, but in my opinion that is not true: your important functionalities have made you a "de facto standard" in the K8s community.
If these metrics are disabled by default so as to not impact performance, then it seems like an improvement.
Would you consider submitting a PR that makes it possible to optionally add and enable these metrics?
Sorry, I think I expressed myself poorly. What I meant is that those metrics do not exist yet, and since I am not an expert, I do not know how they could be created :(
ok /help
@longwuyuan: This request has been marked as needing help from a contributor.
Guidelines
Please ensure that the issue body includes answers to the following questions:
- Why are we solving this issue?
- To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
- Does this issue have zero to low barrier of entry?
- How can the assignee reach out to you for help?
For more details on the requirements of such an issue, please see here and ensure that they are met.
If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.
/triage accepted /priority backlog /help-wanted /good-first-issue
This requires an expert in Prometheus and an expert in NGINX who understand this controller's metrics.
/remove-good-first-issue
Let me try to find out whether it is possible!
/assign
Any progress here? 🤞
Hi @luarx ,
We would love to get PRs that add the metrics you mentioned. This update is to state that, due to a severe shortage of resources, we are even deprecating existing popular features because they are hard to support and maintain. Instead, the focus is on security and the Gateway API implementation.
So while resolving this with PRs would be most welcome, there is little possibility of allocating resources for this. There is therefore no action item here for the project, and the issue is adding to the tally of open issues without an action item.
It would be better to close this issue, as it does not track any action item. If anyone wants to contribute in the future, it can be re-opened. WDYT?
Agree with you, thanks for clarifying @longwuyuan 🙏