
Node issue

Open littleunicorn opened this issue 1 year ago • 4 comments

Issue Type

Install/Deploy

Have you searched for existing documents and issues?

Yes

OS Platform and Distribution

Docker service

All_in_one Version

v1.6.1b0

Module type

secretflow

Module version

v1.6.1b0

What happened and what you expected to happen.

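A task flow that used to run successfully now fails every time it is executed; the private set intersection (PSI) step keeps reporting errors.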

Log output.

Can't schedule the domain cf-model pod mjal-tkzzeylo-node-37-0 because task resource mjal-tkzzeylo-node-37-05238f8e3a51 status phase isn't schedulable, domain [cf-external] can not reserve resources for pods
2024-07-15 12:01:01.820 WARN kusciascheduling/kusciascheduling.go:246 PreBind pod cf-model/mjal-tkzzeylo-node-37-0 failed, domain [cf-external] can not reserve resources for pods
E0715 12:01:01.820092    1480 schedule_one.go:876] "Error scheduling pod; retrying" err="domain [cf-external] can not reserve resources for pods" pod="cf-model/mjal-tkzzeylo-node-37-0"
2024-07-15 12:01:01.820 INFO nlog/nlog.go:77 E0715 12:01:01.820092    1480 schedule_one.go:876] "Error scheduling pod; retrying" err="domain [cf-external] can not reserve resources for pods" pod="cf-model/mjal-tkzzeylo-node-37-0"

littleunicorn avatar Jul 15 '24 09:07 littleunicorn

image

littleunicorn avatar Jul 15 '24 09:07 littleunicorn

I'll set up a screen-sharing session with you tomorrow so we can look at it together.

aokaokd avatar Jul 15 '24 10:07 aokaokd

I'll set up a screen-sharing session with you tomorrow so we can look at it together.

What time?

littleunicorn avatar Jul 16 '24 02:07 littleunicorn

Resolved after discussion. The disk was full, so tasks could not be dispatched, which caused this problem.

aokaokd avatar Jul 16 '24 07:07 aokaokd
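
Since the root cause here was a full disk, a quick disk-usage check on the host can help rule this out before digging into the scheduler logs. The following is a minimal sketch in Python, not part of secretpad or kuscia. The helper names, the path `/`, and the 90% threshold are assumptions chosen for illustration; point it at whichever filesystem actually backs your Docker data.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of used space on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total * 100


def is_disk_nearly_full(path="/", threshold_percent=90.0):
    """Return True if usage on `path` is at or above `threshold_percent`.

    The default threshold of 90% is an illustrative assumption, not a
    value taken from secretpad or kuscia.
    """
    return disk_usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    path = "/"  # assumption: replace with the mount point that holds Docker data
    percent = disk_usage_percent(path)
    print(f"{path}: {percent:.1f}% used")
    if is_disk_nearly_full(path):
        print("Disk is nearly full; free up space before re-running the task.")
```

If the check reports high usage, freeing space and re-running the task flow is a reasonable first step, given that this was the fix reported in this thread.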