xavier chang
Same problem: https://github.com/volcano-sh/volcano/issues/3186. We can fix it to resolve both of them.
@LivingCcj @lowang-bh You're welcome to fix this :)
This problem recurs when vGPU resources are insufficient. Here are the volcano scheduler logs:

```
I0319 02:51:05.281886 1 preempt.go:43] Enter Preempt ...
I0319 02:51:05.281895 1 job_info.go:728] job podgroup-f354bb74-7c3d-4429-aa92-3c02a7ab99ba/kubeflow actual: map[:1],...
```
> I might be experiencing a similar issue. My cluster has 4 GPU nodes. First, I start a 4-node job with low priority, which gets scheduled and starts running. A little later...
> This issue has been successfully reproduced in volcano 1.16 and will be resolved in the future.

Any progress here?
The running state of a volcano job is derived from its pods, so the volcano job shows the running state whenever the pod is running. Even if the pod is not ready, it is still in the running state, so volcano...
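> > I'm using the k8s readiness probe. When the service has not finished deploying, the pod's READY status is 0/1. When the pod's ready status changes, the volcano job's status stays running the whole time, and no change in the volcano job's status can be detected. Is this something volcano needs to improve? If not, how can it be configured so that volcano or k8s notices changes in the pod's readiness?
>
> Also hit this problem: volcano cannot see the pod's ready status, so we handle it in the application layer.

Yeah, volcano won't care about the pod's ready...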
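For anyone handling this in the application layer as suggested above, here is a minimal sketch of the check. It assumes you already have the job's pod objects as plain dicts, for example parsed from `kubectl get pods -o json`. The helper names `is_pod_ready` and `all_pods_ready` are made up for illustration and are not part of volcano. It relies only on the standard Kubernetes pod fields `status.phase` and the `Ready` entry in `status.conditions`.

```python
from typing import Any


def is_pod_ready(pod: dict[str, Any]) -> bool:
    """Return True only if the pod is Running AND its Ready condition is True.

    A pod whose readiness probe is failing still reports phase "Running",
    which is why checking the phase alone is not enough.
    """
    status = pod.get("status") or {}
    if status.get("phase") != "Running":
        return False
    conditions = status.get("conditions") or []
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def all_pods_ready(pods: list[dict[str, Any]]) -> bool:
    """Hypothetical job-level check: every pod of the job must be ready.

    An empty pod list is treated as not ready.
    """
    return bool(pods) and all(is_pod_ready(p) for p in pods)


# Example pod statuses (hand-written, not real cluster output).
running_but_not_ready = {
    "status": {
        "phase": "Running",
        "conditions": [{"type": "Ready", "status": "False"}],
    }
}
running_and_ready = {
    "status": {
        "phase": "Running",
        "conditions": [{"type": "Ready", "status": "True"}],
    }
}
```

With these samples, `is_pod_ready(running_but_not_ready)` is false even though the phase is `Running`, which matches the 0/1 READY case described in the question.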