flyte icon indicating copy to clipboard operation
flyte copied to clipboard

When there's a triggered LP SubWorkflow use the maxParallelism of LP

Open RRap0so opened this issue 1 year ago • 7 comments

Tracking issue

Closes #5555

Why are the changes needed?

What changes were proposed in this pull request?

I'm proposing that Sub Executions should respect the limit set of a remote LaunchPlan. The authors of the LaunchPlan might have a reason to have this value set and could cause problems if the Main Execution has a higher limit

How was this patch tested?

This was tested using a Workflow that triggers a remote launch plan in a test environment

Check all the applicable boxes

  • [ ] I updated the documentation accordingly.
  • [x] All new and existing tests passed.
  • [x] All commits are signed-off.

RRap0so avatar Jul 11 '24 11:07 RRap0so

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 61.08%. Comparing base (81afb76) to head (151f698). Report is 427 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5554      +/-   ##
==========================================
+ Coverage   60.98%   61.08%   +0.09%     
==========================================
  Files         796      942     +146     
  Lines       51647    76187   +24540     
==========================================
+ Hits        31498    46539   +15041     
- Misses      17249    26412    +9163     
- Partials     2900     3236     +336     
Flag Coverage Δ
unittests-datacatalog 58.90% <ø> (-10.42%) :arrow_down:
unittests-flyteadmin 58.78% <100.00%> (+0.05%) :arrow_up:
unittests-flytecopilot 29.92% <ø> (+12.13%) :arrow_up:
unittests-flytectl 65.90% <ø> (-1.55%) :arrow_down:
unittests-flyteidl 80.14% <ø> (+1.08%) :arrow_up:
unittests-flyteplugins 63.28% <ø> (+1.43%) :arrow_up:
unittests-flytepropeller 58.49% <ø> (+1.07%) :arrow_up:
unittests-flytestdlib 66.58% <ø> (+0.89%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Jul 11 '24 11:07 codecov[bot]

Took a good look at the tests and couldn't really figure out to make sure this is covered. Let me know if you have any tips!

RRap0so avatar Jul 11 '24 13:07 RRap0so

@hamersaw sure thing.

We treat our LPs like libraries that are shared to whoever wants to use them. Depending on what systems they touch, the author of those LPs might want to be able to pre-define a max Parallelism. By adding this, we check if the launchPlan has a parentNodeId meaning that is not main execution, and if so, check if there's a parallelism defined by the author of said LP. Main execution keeps it's parallelism, sub-executions use defined LP first and if not defined then inherits whatever comes from the Main execution.

There's also the missing piece from the workflow executor side which I'm planning to work on but just wanted to hear your thoughts first. Screenshot 2024-07-16 at 10 03 53

RRap0so avatar Jul 16 '24 08:07 RRap0so

@hamersaw sure thing.

We treat our LPs like libraries that are shared to whoever wants to use them. Depending on what systems they touch, the author of those LPs might want to be able to pre-define a max Parallelism. By adding this, we check if the launchPlan has a parentNodeId meaning that is not main execution, and if so, check if there's a parallelism defined by the author of said LP. Main execution keeps it's parallelism, sub-executions use defined LP first and if not defined then inherits whatever comes from the Main execution.

There's also the missing piece from the workflow executor side which I'm planning to work on but just wanted to hear your thoughts first.

What is the behavior right now? In FlytePropeller I see the MaxParallelism being set on the ExecutionCreateRequest for executing a LP is set to the MaxParallelism for the current execution. This allows things like overriding MaxParallelism from the UI / flytekit instead of relying on the LaunchPlanSpec definition.

hamersaw avatar Jul 17 '24 16:07 hamersaw

@hamersaw sure thing. We treat our LPs like libraries that are shared to whoever wants to use them. Depending on what systems they touch, the author of those LPs might want to be able to pre-define a max Parallelism. By adding this, we check if the launchPlan has a parentNodeId meaning that is not main execution, and if so, check if there's a parallelism defined by the author of said LP. Main execution keeps it's parallelism, sub-executions use defined LP first and if not defined then inherits whatever comes from the Main execution. There's also the missing piece from the workflow executor side which I'm planning to work on but just wanted to hear your thoughts first.

What is the behavior right now? In FlytePropeller I see the MaxParallelism being set on the ExecutionCreateRequest for executing a LP is set to the MaxParallelism for the current execution. This allows things like overriding MaxParallelism from the UI / flytekit instead of relying on the LaunchPlanSpec definition.

As it is right now, sub-executions will just inherit whatever is set from the main execution.

RRap0so avatar Jul 18 '24 09:07 RRap0so

Cleaning stale PRs. Please reopen if you wan to discuss this further.

eapolinario avatar Mar 03 '25 19:03 eapolinario

Hey @eapolinario @hamersaw we still have folks every now and then asking for this feature. Did you had a chance to take a look at it?

RRap0so avatar Mar 21 '25 11:03 RRap0so