BurningRock
> Can you describe the specific scenario? Multi-node training with DeepSpeed. In this case, two conditions need to be met. 1. Passwordless SSH between pods (may can...
> Volcano has plugins for your scenario: https://github.com/volcano-sh/volcano/blob/master/docs/user-guide/how_to_use_env_plugin.md https://github.com/volcano-sh/volcano/blob/master/docs/user-guide/how_to_use_ssh_plugin.md https://github.com/volcano-sh/volcano/blob/master/docs/user-guide/how_to_use_svc_plugin.md OK, thanks. I will try it to see if it works.
> @rockburning May I ask if your attempt was successful? Yes, just use the svc plugin and rely on the headless Service DNS records.
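A minimal sketch of how the headless Service DNS records could be turned into a DeepSpeed hostfile. It assumes pods are reachable as `<job>-<task>-<index>.<job>`, which is my understanding of what the svc plugin exposes; the job name `ds-job`, the task name `worker`, and the slot count are hypothetical.

```python
def build_hostfile(job_name: str, task_name: str, replicas: int, slots: int = 8) -> str:
    """Return DeepSpeed hostfile text, one '<host> slots=<n>' line per pod.

    Assumption: each pod resolves as '<job>-<task>-<index>.<job>' through
    the headless Service created for the job.
    """
    lines = []
    for index in range(replicas):
        host = f"{job_name}-{task_name}-{index}.{job_name}"
        lines.append(f"{host} slots={slots}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job with two worker pods and the default slot count.
    print(build_hostfile("ds-job", "worker", 2), end="")
```

The resulting text can be written to a file and passed to the `deepspeed` launcher with `--hostfile`.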
> > @rockburning May I ask if your attempt was successful? > Yes, just use the svc plugin and rely on the headless Service DNS records. slot_value="${1:-8}" this is the...
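> The current official Volcano examples run MPI and communicate using a single container image. Does Volcano support MPI communication between containers built from different images, and are there any related examples? You can read the host files generated under /etc/volcano.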
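A rough sketch of reading those generated host files, so that tasks using different images can still discover each other. It assumes one `<task>.host` file per task with one hostname per line; check the actual contents of `/etc/volcano` in your pods before relying on it.

```python
from pathlib import Path


def read_volcano_hosts(host_dir: str = "/etc/volcano") -> dict[str, list[str]]:
    """Map each task name to its list of hostnames.

    Assumption: files are named '<task>.host' and hold one hostname per line.
    """
    hosts: dict[str, list[str]] = {}
    for path in sorted(Path(host_dir).glob("*.host")):
        names = [line.strip() for line in path.read_text().splitlines() if line.strip()]
        hosts[path.stem] = names
    return hosts


if __name__ == "__main__":
    for task, names in read_volcano_hosts().items():
        print(task, ",".join(names))
```

The per-task lists can then be joined into whatever host argument the MPI launcher in use expects.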
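> I am using the Kubernetes readiness probe. While the application has not finished deploying, the pod's ready status is 0/1. When the pod's ready status changes, the Volcano job stays in the Running state the whole time, and no change in the Volcano job status can be detected. Is this something Volcano needs to improve? If not, how can it be configured so that Volcano or Kubernetes detects changes in pod readiness? I hit the same problem: Volcano cannot get the pod ready status, so handle it in the application layer.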
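> > > I am using the Kubernetes readiness probe. While the application has not finished deploying, the pod's ready status is 0/1. When the pod's ready status changes, the Volcano job stays in the Running state the whole time, and no change in the Volcano job status can be detected. Is this something Volcano needs to improve? If not, how can it be configured so that Volcano or Kubernetes detects changes in pod readiness? > > I hit the same problem: Volcano cannot get the pod ready status, so handle it in the application layer. >...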
We want to generate an ingress path for the user once the pod is really ready. In Volcano, the vcjob cannot get the pod ready status, so we have to check the pod...
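A minimal sketch of doing that readiness check in the application layer. It works on pod objects in the JSON shape returned by the Kubernetes API (for example from `kubectl get pods -o json`) and looks for the standard `Ready` condition; how the pods of a job are fetched is left out, and the function names are hypothetical.

```python
def pod_is_ready(pod: dict) -> bool:
    """True if the pod reports a 'Ready' condition with status 'True'."""
    conditions = pod.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def all_pods_ready(pods: list[dict]) -> bool:
    """True only if there is at least one pod and every pod is ready."""
    return bool(pods) and all(pod_is_ready(p) for p in pods)


if __name__ == "__main__":
    # Hypothetical pod objects, trimmed to the fields the check reads.
    ready = {"status": {"conditions": [{"type": "Ready", "status": "True"}]}}
    pending = {"status": {"conditions": [{"type": "Ready", "status": "False"}]}}
    print(all_pods_ready([ready, pending]))
```

Only once `all_pods_ready` returns true would the application go on to create the ingress path.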
Why was it removed in a later version? In v1.12.2 I met the same problem when using the capacity plugin: none of the tasks can be scheduled.
> > Why was it removed in a later version? In v1.12.2 I met the same problem when using the capacity plugin: none of the tasks can be scheduled. > > What phenomenon? Did...