koordinator
koordinator copied to clipboard
koordlet: kill container after calling eviction api success
Ⅰ. Describe what this PR does
fix #1758
Ⅱ. Does this pull request fix one issue?
Ⅲ. Describe how to verify it
Ⅳ. Special notes for reviews
V. Checklist
- [x] I have written necessary docs and comments
- [x] I have added necessary unit tests and integration tests
- [x] All checks passed in
make test
Codecov Report
Attention: Patch coverage is 73.17073%
with 11 lines
in your changes are missing coverage. Please review.
Project coverage is 68.56%. Comparing base (
eed98fa
) to head (423e5b2
). Report is 15 commits behind head on main.
Files | Patch % | Lines |
---|---|---|
...let/qosmanager/plugins/memoryevict/memory_evict.go | 52.63% | 9 Missing :warning: |
.../koordlet/qosmanager/plugins/cpuevict/cpu_evict.go | 87.50% | 1 Missing and 1 partial :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #1759 +/- ##
==========================================
- Coverage 68.56% 68.56% -0.01%
==========================================
Files 430 430
Lines 39383 39408 +25
==========================================
+ Hits 27004 27020 +16
- Misses 10043 10048 +5
- Partials 2336 2340 +4
Flag | Coverage Δ | |
---|---|---|
unittests | 68.56% <73.17%> (-0.01%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
memory evict is just like oom killer, which dose not restricted by pdb. by default, memory evict should always kill containers even the eviction is failed for rpc timeout for example. so it is better to add flag to control the default logic.
memory evict is just like oom killer, which dose not restricted by pdb. by default, memory evict should always kill containers even the eviction is failed for rpc timeout for example. so it is better to add flag to control the default logic.
How about add a koordlet feature-gate named DoNotEvictPodIfCallEvictionAPIFailed
, maybe it's too long, WDYT?
memory evict is just like oom killer, which dose not restricted by pdb. by default, memory evict should always kill containers even the eviction is failed for rpc timeout for example. so it is better to add flag to control the default logic.
How about add a koordlet feature-gate named
DoNotEvictPodIfCallEvictionAPIFailed
, maybe it's too long, WDYT?
feature-gate
means new features, which will be iterated from alpha(default=false), beta(default=true), and ga.
so an argument named --only-evict-by-api
(default=false) is better, and only call eviction api when enabled.
also, please add a TODO comment for supporting fine-grained eviction arguments just like kubelet
memory evict is just like oom killer, which dose not restricted by pdb. by default, memory evict should always kill containers even the eviction is failed for rpc timeout for example. so it is better to add flag to control the default logic.
How about add a koordlet feature-gate named
DoNotEvictPodIfCallEvictionAPIFailed
, maybe it's too long, WDYT?
feature-gate
means new features, which will be iterated from alpha(default=false), beta(default=true), and ga. so an argument named--only-evict-by-api
(default=false) is better, and only call eviction api when enabled.also, please add a TODO comment for supporting fine-grained eviction arguments just like kubelet
Maybe you misunderstood me, I want to kill container on when calling eviction api returns ok, not eviction only by API. @zwzhang0107
want
Actually, after the eviction api returns ok, there is no need to kill container if you want to solve the grace termination or PDB problem. It seems that defines a self-defined evictor args(--evictor="default/only-eviction/...", which means --only-evict-by-api) can solve your concerns?
want
Actually, after the eviction api returns ok, there is no need to kill container if you want to solve the grace termination or PDB problem. It seems that defines a self-defined evictor args(--evictor="default/only-eviction/...", which means --only-evict-by-api) can solve your concerns?
Got you, I will file a new patch later.
@zwzhang0107 PTAL
This issue has been automatically marked as stale because it has not had recent activity. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity,
lifecycle/stale
is applied - After 30d of inactivity since
lifecycle/stale
was applied, the issue is closed You can: - Mark this issue or PR as fresh with
/remove-lifecycle stale
- Close this issue or PR with
/close
Thank you for your contributions.
/lgtm
@j4ckstraw please solve the code conflicts, and we are good to go.
@zwzhang0107 rebased
/lgtm /approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: zwzhang0107
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~pkg/koordlet/OWNERS~~ [zwzhang0107]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment