kubedl
refactor(mpi): remove global launcherRunsWorkload flag.
Signed-off-by: SimonCqk [email protected]
Ⅰ. Describe what this PR does
Remove the global `launcherRunsWorkload` startup flag; its value can be inferred from whether the job spec contains a Launcher role.
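A minimal sketch of the inference described above, not the code from this PR. The types `ReplicaType` and `ReplicaSpec`, the constants, and the helper `launcherRunsWorkload` are hypothetical stand-ins for the real job API types and are assumptions made only for illustration:

```go
package main

import "fmt"

// ReplicaType and ReplicaSpec are hypothetical stand-ins for the
// job API types; the real definitions are not part of this sketch.
type ReplicaType string

const (
	ReplicaLauncher ReplicaType = "Launcher"
	ReplicaWorker   ReplicaType = "Worker"
)

type ReplicaSpec struct {
	Replicas int32
}

// launcherRunsWorkload infers, per job, what the global flag used to
// state for all jobs: it reports true only when the job spec carries
// a (non-nil) Launcher role.
func launcherRunsWorkload(specs map[ReplicaType]*ReplicaSpec) bool {
	spec, ok := specs[ReplicaLauncher]
	return ok && spec != nil
}

func main() {
	withLauncher := map[ReplicaType]*ReplicaSpec{
		ReplicaLauncher: {Replicas: 1},
		ReplicaWorker:   {Replicas: 4},
	}
	workersOnly := map[ReplicaType]*ReplicaSpec{
		ReplicaWorker: {Replicas: 4},
	}

	fmt.Println(launcherRunsWorkload(withLauncher)) // true
	fmt.Println(launcherRunsWorkload(workersOnly))  // false
}
```

Deriving the value from each job's spec means two jobs handled by the same controller can behave differently, which a process-wide startup flag cannot express.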
II. Does this pull request fix one issue?
fix #194
III. Special notes for reviewers if any.
Codecov Report
Merging #198 (d1cbc96) into master (0eb96cd) will not change coverage. The diff coverage is n/a.
Coverage Diff

| | master | #198 | +/- |
|---|---|---|---|
| Coverage | 23.18% | 23.18% | |
| Files | 75 | 75 | |
| Lines | 4502 | 4502 | |
| Hits | 1044 | 1044 | |
| Misses | 3319 | 3319 | |
| Partials | 139 | 139 | |

| Flag | Coverage Δ | |
|---|---|---|
| unittests | 23.18% <ø> (ø) |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 0eb96cd...d1cbc96.
Does this mean the launcher pod always runs the workload? Is this expected?
If the Launcher role is in the job spec, then yes; otherwise the workers should trigger the workload.
Then why do we need the launcher pod in the first place, if it runs the same as a worker?
Anyway, the patch looks OK.
The launcher pod's execution scripts are generated by the controller, and it triggers training by hooking scripts into the OpenMPI framework's extension points. If the job spec contains only Worker roles, the user has to handle this manually.
I don't think that is OK, because for an MPIJob we always have the Launcher role in the job spec.
Yes, that's what I mean: normally an MPIJob is driven by launcher commands. In that case, should `launcherRunsWorkload` always be true?
In my opinion, `launcherRunsWorkload` should not always be true.