
Unsafe assumptions and leaked goroutine in logging

Open howardjohn opened this issue 4 years ago • 8 comments

This code is problematic: https://github.com/kubernetes-sigs/controller-runtime/blob/498ee8ae7cc1dfc414f453f98cde3114116e5442/pkg/log/log.go#L53-L70

First, it leaks a goroutine for 30s just by importing the library. It's generally assumed that importing a package does not create goroutines. The real-world impact is that various projects (Istio, gRPC, etc.) have tests that assert no goroutines are leaking; these fail if controller-runtime is so much as imported.

Another issue is the code's assumption that if a logger wasn't set within the first 30 seconds of the binary's lifetime, it never will be. This does not seem safe to assert: there is certainly no reason I cannot set a logger 31 seconds, or 30 hours, into the process lifetime. The controller-runtime library can be, and is, used as a small piece of larger binaries; it's not always the sole thing running in the process.

howardjohn avatar Sep 03 '21 23:09 howardjohn

The controller-runtime library can be, and is, used as a small piece of larger binaries; it's not always the sole thing running in the process

While we want to improve the use cases around this, especially because the controller-runtime client is so much nicer to work with than client-go for many use cases, we do still largely optimize for writing operators with tools like Kubebuilder and Operator-SDK, where controller-runtime is indeed in overarching control of the universe. This code was added to fix some substantial memory-leak issues caused by dangling loggers. If you have suggestions on how to improve this without hurting the primary use cases, please do let us know.

coderanger avatar Sep 04 '21 00:09 coderanger

Hey @howardjohn ,

The reason this logic exists is that the DelegatingLogger ends up consuming a huge amount of memory if no Logger gets set: https://github.com/kubernetes-sigs/controller-runtime/issues/1122

The current logic is the best we could come up with to deal with that situation, and "leaking" a goroutine for 30 seconds/not allowing a logger to be set after that is definitely less of an issue than the memory consumption.

If you have any ideas on how to improve things here, they are very welcome.

alvaroaleman avatar Sep 05 '21 22:09 alvaroaleman

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Dec 04 '21 23:12 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jan 03 '22 23:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Feb 02 '22 23:02 k8s-triage-robot

@k8s-triage-robot: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 02 '22 23:02 k8s-ci-robot

I just found this weird goroutine in a debugging session.

If you have any ideas on how to improve things here, they are very welcome.

Why not use klog, like Kubernetes? That would be consistent and not a surprise at all. Most apps that care about logs and use Kubernetes libraries configure klog via klog.SetOutput().

/reopen
/remove-lifecycle rotten

ash2k avatar Aug 04 '22 11:08 ash2k

@ash2k: Reopened this issue.


k8s-ci-robot avatar Aug 04 '22 11:08 k8s-ci-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 02 '22 11:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Dec 02 '22 11:12 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Jan 01 '23 12:01 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".


k8s-ci-robot avatar Jan 01 '23 12:01 k8s-ci-robot

/reopen
/remove-lifecycle rotten

ash2k avatar Feb 28 '23 09:02 ash2k

@ash2k: Reopened this issue.


k8s-ci-robot avatar Feb 28 '23 09:02 k8s-ci-robot