kruise
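[feature request] Potential problems when the BroadcastJob completion policy is Never
These are questions that came up while studying the bcj (BroadcastJob) implementation; they are not tied to a specific version.
What would you like to be added: 1. For long-running bcj jobs, could a GC mechanism be added for Pods in the Succeeded state?
Why is this needed: 1. When CompletionPolicy is Never, a large number of Succeeded Pods persist for a long time in large-scale clusters. Could this degrade control-plane performance? 2. If the number of Succeeded Pods exceeds the value set by terminated-pod-gc-threshold, does Kubernetes' own garbage collection affect the bcj itself? For example, once those Pods are collected, is reconciliation triggered to create Pods again until the desired count is reached?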
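To make the request more concrete, here is a minimal sketch of how a GC pass could pick which Succeeded Pods to delete: keep the most recent `retain` results and select the older ones. `PodInfo`, `selectForGC` and `retain` are hypothetical names used only for illustration; they are not Kruise or Kubernetes APIs, and this does not describe how either project actually behaves.

```go
package main

import (
	"fmt"
	"sort"
)

// PodInfo is a hypothetical, simplified stand-in for the Pod fields that
// matter here. It is not a Kubernetes or Kruise type.
type PodInfo struct {
	Name         string
	Phase        string // "Succeeded", "Running", ...
	FinishedUnix int64  // completion time in Unix seconds (assumed field)
}

// selectForGC returns the names of the oldest Succeeded pods beyond the
// retain limit, oldest first. Pods in any other phase are never selected.
func selectForGC(pods []PodInfo, retain int) []string {
	if retain < 0 {
		retain = 0
	}
	var done []PodInfo
	for _, p := range pods {
		if p.Phase == "Succeeded" {
			done = append(done, p)
		}
	}
	if len(done) <= retain {
		return nil
	}
	// Sort oldest first so that the most recent results are the ones kept.
	sort.SliceStable(done, func(i, j int) bool {
		return done[i].FinishedUnix < done[j].FinishedUnix
	})
	excess := done[:len(done)-retain]
	names := make([]string, 0, len(excess))
	for _, p := range excess {
		names = append(names, p.Name)
	}
	return names
}

func main() {
	pods := []PodInfo{
		{Name: "job-a", Phase: "Succeeded", FinishedUnix: 300},
		{Name: "job-b", Phase: "Succeeded", FinishedUnix: 100},
		{Name: "job-c", Phase: "Running"},
		{Name: "job-d", Phase: "Succeeded", FinishedUnix: 500},
	}
	// Keep only the newest Succeeded pod; the two older ones are selected.
	fmt.Println(selectForGC(pods, 1))
}
```

Deleting the selected Pods only helps if the controller does not then recreate them, which is the second question above.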
I have some ideas: if the record of a succeeded Pod is deleted, where would that record be kept, in memory or in Kubernetes? If it is kept in Kubernetes, for example in a ConfigMap, there would be many updates to a single object, which would likely consume more CPU than keeping all the Pod records. I think it would be better to keep it in memory and use some checkpoint mechanism.
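A rough sketch of the in-memory option, assuming the only fact worth keeping is the set of nodes where the job already succeeded. `CompletionRecord` and its methods are hypothetical and not part of Kruise. Where the checkpoint bytes are persisted (a local file, a single periodic write to Kubernetes, or elsewhere) is deliberately left open.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"sync"
)

// CompletionRecord is a hypothetical in-memory record of the nodes on which
// a BroadcastJob pod has already succeeded. It is not a Kruise type.
type CompletionRecord struct {
	mu    sync.Mutex
	nodes map[string]bool
}

func NewCompletionRecord() *CompletionRecord {
	return &CompletionRecord{nodes: make(map[string]bool)}
}

// MarkSucceeded remembers that the job finished on node, so the pod itself
// can be deleted without losing that fact.
func (r *CompletionRecord) MarkSucceeded(node string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.nodes[node] = true
}

// NeedsPod reports whether a pod should still be created for node.
func (r *CompletionRecord) NeedsPod(node string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return !r.nodes[node]
}

// Checkpoint serializes the record as a sorted JSON array of node names.
func (r *CompletionRecord) Checkpoint() ([]byte, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	names := make([]string, 0, len(r.nodes))
	for n := range r.nodes {
		names = append(names, n)
	}
	sort.Strings(names)
	return json.Marshal(names)
}

// Restore rebuilds a record from bytes produced by Checkpoint.
func Restore(data []byte) (*CompletionRecord, error) {
	var names []string
	if err := json.Unmarshal(data, &names); err != nil {
		return nil, err
	}
	r := NewCompletionRecord()
	for _, n := range names {
		r.nodes[n] = true
	}
	return r, nil
}

func main() {
	r := NewCompletionRecord()
	r.MarkSucceeded("node-2")
	r.MarkSucceeded("node-1")

	data, err := r.Checkpoint()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))

	// After a restart, the restored record still knows node-1 is done.
	restored, err := Restore(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(restored.NeedsPod("node-1"), restored.NeedsPod("node-3"))
}
```

The trade-off in this sketch is that anything recorded after the last checkpoint is lost on a crash, so a controller built this way could rerun the job on a few nodes after a restart.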
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.