kubedl
kubedl copied to clipboard
refactor(mpi): remove global launcherRunsWorkload flag.
Signed-off-by: SimonCqk [email protected]
Ⅰ. Describe what this PR does
remove launcherRunsWorkload
global startup flag, which can be inferred by whether has Launcher
role in job spec.
II. Does this pull request fix one issue?
fix #194
III. Special notes for reviewers if any.
Codecov Report
Merging #198 (d1cbc96) into master (0eb96cd) will not change coverage. The diff coverage is
n/a
.
@@ Coverage Diff @@
## master #198 +/- ##
=======================================
Coverage 23.18% 23.18%
=======================================
Files 75 75
Lines 4502 4502
=======================================
Hits 1044 1044
Misses 3319 3319
Partials 139 139
Flag | Coverage Δ | |
---|---|---|
unittests | 23.18% <ø> (ø) |
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 0eb96cd...d1cbc96. Read the comment docs.
does this mean the launcher pod always run the workload ? is this expected ?
does this mean the launcher pod always run the workload ? is this expected ?
if launcher role is in job spec, then it is, otherwise workers should tigger workload running.
does this mean the launcher pod always run the workload ? is this expected ?
if launcher role is in job spec, then it is, otherwise workers should tigger workload running.
then, why do we need the launcher pod in the first place, if it runs the same as worker..
anyway, the patch looks ok
then, why do we need the launcher pod in the first place, if it runs the same as worker..
Launcher
pod execution scripts are generated by controller and it triggers training by hooking scripts in OpenMPI framework extension points, if only Worker
roles in job spec, user has to handle it manually.
does this mean the launcher pod always run the workload ? is this expected ?
if launcher role is in job spec, then it is, otherwise workers should tigger workload running.
I don't think it is ok. Because for the mpijob, we always have the launcher role in job spec.
does this mean the launcher pod always run the workload ? is this expected ?
if launcher role is in job spec, then it is, otherwise workers should tigger workload running.
I don't think it is ok. Because for the mpijob, we always have the launcher role in job spec.
yes, that's what I mean, normally mpijob will be driven by launcher commands. In that case, should launcherRunsWorkload
always be true?
does this mean the launcher pod always run the workload ? is this expected ?
if launcher role is in job spec, then it is, otherwise workers should tigger workload running.
I don't think it is ok. Because for the mpijob, we always have the launcher role in job spec.
yes, that's what I mean, normally mpijob will be driven by launcher commands. In that case, should
launcherRunsWorkload
always be true?
In my opinion, launcherRunsWorkload
should not always be true.