controller-runtime icon indicating copy to clipboard operation
controller-runtime copied to clipboard

✨ Introduce pprof server to manager

Open zqzten opened this issue 3 years ago • 11 comments

This PR introduces a configurable standalone server for serving pprof to manager, just like the health and metrics server.

fixes #1779

Additional thoughts: I've been aware of metricsExtraHandlers introduced in #824 which might be an alternative for end users to serve pprof, but I don't think it's quite appropriate to do so, mainly for these two reasons:

  1. pprof is a language built-in feature which is widely used, so it's more appropriate to make its server built-in in the library rather than let end users construct and run themselves in every controller.
  2. metricsExtraHandlers bind all handlers to metrics listener which is not that general and flexible, it cannot meet cases that user want to expose metrics and pprof by different listeners (common use case: a 0.0.0.0: listener for serving metrics and a 127.0.0.1: listener for serving pprof).

zqzten avatar Jun 25 '22 08:06 zqzten

Hi @zqzten. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jun 25 '22 08:06 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zqzten To complete the pull request process, please assign pwittrock after the PR has been reviewed. You can assign the PR to them by writing /assign @pwittrock in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Jun 25 '22 08:06 k8s-ci-robot

TBH, I'm against this one. This is very tempting to add functionality such as this to your solution, but enabling it on production might cause a sensitive information leakage.

I suggest adding a "hint" in your "DEVELOPMENT.MD" file as I did here: https://github.com/natun-ai/natun/blob/master/CONTRIBUTING.md#profiling

AlmogBaku avatar Jun 30 '22 20:06 AlmogBaku

This is very tempting to add functionality such as this to your solution, but enabling it on production might cause a sensitive information leakage.

That's true if we enable pprof and bind it to 0.0.0.0:<port> on production. But I think this PR is ok, since it just provides an option which is disabled by default and we can add doc comments to recommend ppl only bind it to localhost.

FillZpp avatar Jul 01 '22 03:07 FillZpp

@AlmogBaku Thanks for the security remind, but I don't think it as a blocking issue as we have ways to protect it:

  • By binding its listener to 127.0.0.1:, as @FillZpp suggested, which is a very common and naive solution. This is also why seperated listener is introduced instead of reusing metricsExtraHandlers.
  • By protecting pprof server by kube-rbac-proxy, which is also the default solution to protect metrics server in kubebuilder. This is a sophisticated solution which is similarly used by kube-controller-manager (it does the rbac proxy by itself though).

Security is very important indeed, I'll add some doc comments on it, also we can put best practice usage in kubebuilder after this feature is added.

zqzten avatar Jul 01 '22 03:07 zqzten

/ok-to-test

/cc @alvaroaleman

FillZpp avatar Jul 01 '22 08:07 FillZpp

what about adding a build tag as a solution?

AlmogBaku avatar Jul 01 '22 16:07 AlmogBaku

what about adding a build tag as a solution?

I'm not sure whether this solution is appropriate in this project. Let's just wait for opinions from other maintainers.

zqzten avatar Jul 02 '22 11:07 zqzten

Kind of torn here. Not having pprof is extremely annoying in production, but it definitely exposes sensitive information.

@AlmogBaku what is the reason why you would prefer build tags over off-by-default, i.E. what is the problem you see from having it compiled in but disabled?

alvaroaleman avatar Jul 06 '22 14:07 alvaroaleman

Exposing this setting using a build tag is like having a “remote debugger” - possible, but should be allowed implicitly and temporary on production mode

AlmogBaku avatar Jul 06 '22 15:07 AlmogBaku

Exposing this setting using a build tag is like having a “remote debugger” - possible, but should be allowed implicitly and temporary on production mode

I actually disagree with that. Pprof needs to always be enabled or it is useless. If it requires a binary restart of any kind it becomes impossible to debug in situations where there is no clear reproducer. The remaining concern here is that we can't tell if ppl properly wall off connectivity to their controllers, which is why it might be problematic to enable by default.

alvaroaleman avatar Jul 06 '22 15:07 alvaroaleman

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zqzten Once this PR has been reviewed and has the lgtm label, please assign pwittrock for approval by writing /assign @pwittrock in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Dec 09 '22 08:12 k8s-ci-robot

@zqzten: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-controller-runtime-test-master 833a6e6eae88ffd2ef12e39074cb07960b140ed4 link true /test pull-controller-runtime-test-master

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Dec 09 '22 12:12 k8s-ci-robot

/cc @alvaroaleman @vincepri PTAL, thanks.

zqzten avatar Dec 14 '22 12:12 zqzten

@zqzten: GitHub didn't allow me to request PR reviews from the following users: PTAL.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @alvaroaleman @vincepri PTAL

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Dec 14 '22 12:12 k8s-ci-robot

Hi team, could I know when this CL would be merged and released?

fovgrubby128 avatar Jan 10 '23 19:01 fovgrubby128

@alvaroaleman @vincepri @FillZpp Mind give some further points on this? It would be very appreciated. There're many ppl around calling for this feature.

zqzten avatar Mar 13 '23 12:03 zqzten

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri, zqzten

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Apr 12 '23 18:04 k8s-ci-robot

A dumb question here, it would be released in the next version, right?

fovgrubby128 avatar Apr 12 '23 18:04 fovgrubby128

Yes, that's the plan

sbueringer avatar Apr 13 '23 03:04 sbueringer

Yes, that's the plan

Could I know the next release date please since we want to introduce this change in our code for some deadline? thanks!

fovgrubby128 avatar Apr 18 '23 00:04 fovgrubby128

There is no fixed date as far as I know. But I think we want to get https://github.com/kubernetes-sigs/controller-runtime/pull/2189 in first

sbueringer avatar Apr 18 '23 08:04 sbueringer