slotsPerWorker attribute and number of workers per node
Hello, I am confused by the meaning of the slotsPerWorker attribute:
type MPIJobSpec struct {
	// Specifies the number of slots per worker used in hostfile.
	// Defaults to 1.
	// +optional
	SlotsPerWorker *int32 `json:"slotsPerWorker,omitempty"`
	...
}
Does "worker" refer to worker pods or worker nodes?
I would like to deploy 1 MPI job per worker node, but that does not seem to be what the MPI operator does:
(MPI worker pods ...-worker-0 and ...-worker-1 are both on the node worker05)
Is this a bug, or is there a way to deploy one pod per node?
Hello @kpouget,
slotsPerWorker comes from MPI's notion of slots; you can read about it here.
"Worker" means a worker pod, which you configure in the YAML file. To spread the worker pods across nodes, you can use the pod anti-affinity feature.
Hello @carmark,
> The Worker means worker pod, you can setup it in yaml file.

I'm not sure I agree with, or observe, what you're saying:

I request 4 worker replicas and 2 slots per worker, but I get 4 worker pods
edit: there is a mix of solver/mesher in the screenshot but both run with the same settings
@kpouget The number of worker replicas is the number of worker pods, so requesting 4 worker replicas gives you 4 worker pods.
The slots setting does not affect the number of worker pods; it is only written into the MPI hostfile (/etc/mpi/hostfile).
It seems that this was resolved. If you have any questions, feel free to open new issues. /close