Lei Xue
Hello @kpouget, the `slotsPerWorker` value comes from MPI's slots; see [here](https://www.open-mpi.org/faq/?category=running#slots-without-hostfiles) for reference. `Worker` means a worker pod, which you can set up in the YAML file. You may use...
@kpouget The number of replicas is the same as the number of workers. You will get 4 worker pods if you request 4 worker replicas. The slots will not impact the number of...
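
To make the relationship between replicas and slots concrete, here is a minimal sketch in Python. It assumes each worker pod contributes `slotsPerWorker` slots to an Open MPI style hostfile, and that worker pods are named `<job>-worker-<index>`; that naming and the helper names `build_hostfile` and `total_slots` are illustrative assumptions, not part of the operator's API.

```python
def build_hostfile(job_name: str, worker_replicas: int, slots_per_worker: int) -> str:
    """Return an Open MPI style hostfile with one line per worker pod.

    The pod naming scheme "<job>-worker-<index>" is an assumption made
    for illustration only.
    """
    lines = [
        f"{job_name}-worker-{i} slots={slots_per_worker}"
        for i in range(worker_replicas)
    ]
    return "\n".join(lines)


def total_slots(worker_replicas: int, slots_per_worker: int) -> int:
    """Total MPI slots: the pod count stays fixed, slots scale per pod."""
    return worker_replicas * slots_per_worker


if __name__ == "__main__":
    # 4 worker replicas always means 4 hostfile entries (4 pods),
    # whatever slots_per_worker is set to.
    print(build_hostfile("demo", 4, 2))
    print("total slots:", total_slots(4, 2))
```

Changing `slots_per_worker` only changes the `slots=` value on each line; the number of lines (worker pods) follows the replica count.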
@asahalyft The worker pods' status is right; the MPIJob's status should be synced with the launcher pod. I think the main problem is caused by the sync error.
@terrytangyuan Do you know which commit the `mpioperator/mpi-operator:latest` image is based on?
@asahalyft Could you please try the suggestion of @qifengz ?
@asahalyft I tested it locally with v1, installed from the YAML file, and it works as expected. Yes, you can switch to `v1alpha2`, longer version.
@yuyue9284 Which Volcano version are you using? And could you post your mpi-operator deployment? Volcano v1.0.0 changed the PodGroup CRD API group to `volcano.sh`, so you may need to use...
@suluner Could you please attach more logs? And what is your network environment: RoCE, IB, or something else? It would help if you also provided the output of `ip a`.
@lvcaiping You can enable the batch scheduler with `--gang-scheduling`.