karmada
karmada copied to clipboard
Call shutdown after being stopped
What type of PR is this? /kind cleanup /kind bug
What this PR does / why we need it:
we forget to call shutdown
to close workqueue in these two components.
- karmada-scheduler
- karmada-metrics-adapter
Which issue(s) this PR fixes: Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
NONE
Codecov Report
Attention: Patch coverage is 0%
with 1 lines
in your changes are missing coverage. Please review.
Project coverage is 53.19%. Comparing base (
81b8c4c
) to head (0708b7d
). Report is 18 commits behind head on master.
Files | Patch % | Lines |
---|---|---|
pkg/scheduler/scheduler.go | 0.00% | 1 Missing :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## master #4916 +/- ##
==========================================
+ Coverage 53.12% 53.19% +0.06%
==========================================
Files 251 252 +1
Lines 20415 20509 +94
==========================================
+ Hits 10846 10910 +64
- Misses 8855 8880 +25
- Partials 714 719 +5
Flag | Coverage Δ | |
---|---|---|
unittests | 53.19% <0.00%> (+0.06%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
/cc @RainbowMango @XiShanYongYe-Chang to take a look
@whitewindmills: GitHub didn't allow me to request PR reviews from the following users: to, take, a, look.
Note that only karmada-io members and repo collaborators can review this PR, and authors cannot review their own PRs.
In response to this:
/cc @RainbowMango @XiShanYongYe-Chang to take a look
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Thanks /assign
Just a little question, if the queue is not shut down, what is the impact? Memory leak?
ShutDown
will cause workqueue to ignore all new items added to it and immediately instruct the worker goroutines(include metrics goroutine) to exit. it is used to exit the currently running Goroutines gracefully. this is essential for managing resource cleanup and preventing memory leaks.
Thanks a lot~ /lgtm /cc @chaunceyjiang @RainbowMango
Do we need the release note and cherry pick this patch to the previous release?
ping @RainbowMango @chaunceyjiang
Do we need the release note and cherry pick this patch to the previous release?
+1
/assign @RainbowMango
ShutDown will cause workqueue to ignore all new items added to it and immediately instruct the worker goroutines(include metrics goroutine) to exit. it is used to exit the currently running Goroutines gracefully. this is essential for managing resource cleanup and preventing memory leaks.
When the controller is going to stop(close the stopCh), all derived go routines will be closed anyway, not in a graceful way. But does it cause memory leaks?
Another questions, do we need to stop the others, like InformerFactory
, TypedInformerManager
? In https://github.com/karmada-io/karmada/blob/6f29134f1e3de90029e1e020dfab01e5df6d49cf/pkg/metricsadapter/controller.go#L53-L52
But does it cause memory leaks?
not absolutely, it depends on what is done in this goroutine. but we'd better follow good practices.
do we need to stop the others, like InformerFactory, TypedInformerManager?
-
for
InformerFactory
, judging from its implementation, it seems that calling theShutdown
function does not make sense. https://github.com/karmada-io/karmada/blob/6f29134f1e3de90029e1e020dfab01e5df6d49cf/pkg/generated/informers/externalversions/factory.go#L146-L152 -
for
TypedInformerManager
, even if theStopInstance
function is called, https://github.com/karmada-io/karmada/blob/6f29134f1e3de90029e1e020dfab01e5df6d49cf/pkg/util/fedinformer/typedmanager/multi-cluster-manager.go#L30-L58 it will not take effect because the controller does not use the internalstopCh
channel at all during initialization. this seems to be illegal. maybe we need other PR to modify this part of the logic. https://github.com/karmada-io/karmada/blob/6f29134f1e3de90029e1e020dfab01e5df6d49cf/pkg/metricsadapter/controller.go#L71 https://github.com/karmada-io/karmada/blob/6f29134f1e3de90029e1e020dfab01e5df6d49cf/pkg/metricsadapter/controller.go#L134-L140context.TODO().Done()
is used as a stop semaphore.
not absolutely, it depends on what is done in this goroutine. but we'd better follow good practices.
Yeah, Gracefully shutting down is good. I just want to clarify the impact as well as the benefit, the decision regarding to backport or not is based on the analysis. In my opinion, there won't be any memory leak issue here, so no need to backport this. One of the benefits I can get is that, for the informer, a graceful shutdown could close the connection between the target Kubernetes API server in time, there isn't a need to wait time out for the apiserver side to close the connection.
One of the benefits I can get is that, for the informer, a graceful shutdown could close the connection between the target Kubernetes API server in time, there isn't a need to wait time out for the apiserver side to close the connection.
I also get it.
it will not take effect because the controller does not use the internal stopCh channel at all during initialization. this seems to be illegal. maybe we need other PR to modify this part of the logic.
Just quickly looked at the code, yes, the usage of Stopch and context seems casual, welcome to optimize it a little bit.
ping @RainbowMango
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: RainbowMango
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~cmd/metrics-adapter/OWNERS~~ [RainbowMango]
- ~~pkg/metricsadapter/OWNERS~~ [RainbowMango]
- ~~pkg/scheduler/OWNERS~~ [RainbowMango]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment