
feat: Add pipeline run parallelism config

Open sduvvuri1603 opened this pull request 1 month ago • 12 comments

Summary

  • Replace the previous semaphore/mutex knobs with a single pipeline_run_parallelism option on dsl.PipelineConfig. This lets the API server own the Argo semaphore lifecycles instead of expecting users to edit shared ConfigMaps—eliminating a Kubernetes-heavy workflow and ensuring keys align to <pipeline>/<version>.
  • Thread the new field through SDK, compiler, and backend so the requested concurrency cap lands in Argo’s spec.parallelism.
  • Add the pipeline_with_run_parallelism sample (three-item ParallelFor) to exercise the setting while leaving the workspace fixture focused on workspace behaviour.
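
For illustration, here is a minimal sketch of how the cap could surface in the compiled Argo Workflow, assuming `pipeline_run_parallelism` maps directly onto Argo's `spec.parallelism` as the summary describes (on the SDK side this would be set via something like `dsl.PipelineConfig(pipeline_run_parallelism=2)`; the surrounding fields below are placeholders, not output from this branch):

```yaml
# Sketch only: assumes a 1:1 mapping from the SDK's
# pipeline_run_parallelism onto Argo's spec.parallelism.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pipeline-with-run-parallelism-
spec:
  parallelism: 2   # at most 2 pods of this run execute concurrently
  entrypoint: root
  # ...templates for the three-item ParallelFor omitted...
```

With `parallelism: 2` and a three-item ParallelFor, the third iteration pod would stay pending until one of the first two completes.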

Validation

  • SDK and backend goldens now include the updated sample, showing consistent IR and Argo outputs with the parallelism limit.
  • Built custom API server and driver images from this branch, loaded them into a kind cluster, ran the sample, and confirmed that the number of simultaneously running component pods never exceeded the configured limit.
  • Added the parallelism validation helper to the e2e suite (e2e_utils.go + invocation in pipeline_e2e_test.go), rebuilt the test cluster with the fresh backend images, exercised the focused pipeline_run_parallelism scenario, and then ran the end-to-end suite to confirm the new check passes with the concurrency cap enforced.

Follow-up PR: remove the unused semaphore_key and mutex_name fields

sduvvuri1603 avatar Nov 12 '25 21:11 sduvvuri1603

Hi @sduvvuri1603. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-prow[bot] avatar Nov 12 '25 21:11 google-oss-prow[bot]

/retest

alyssacgoins avatar Nov 13 '25 15:11 alyssacgoins

@sduvvuri1603: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

google-oss-prow[bot] avatar Nov 17 '25 14:11 google-oss-prow[bot]

/ok-to-test

hbelmiro avatar Nov 17 '25 14:11 hbelmiro

/retest

hbelmiro avatar Nov 17 '25 14:11 hbelmiro

@sduvvuri1603 can you please add what this config is supposed to do to the PR description? And a section about how you've validated the functionality.

nsingla avatar Nov 21 '25 18:11 nsingla

/hold

HumairAK avatar Dec 08 '25 17:12 HumairAK

ignore the accidental approve

HumairAK avatar Dec 08 '25 17:12 HumairAK

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please ask for approval from humairak. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

google-oss-prow[bot] avatar Dec 09 '25 21:12 google-oss-prow[bot]

/retest

sduvvuri1603 avatar Dec 10 '25 17:12 sduvvuri1603

/retest

sduvvuri1603 avatar Dec 10 '25 17:12 sduvvuri1603

/retest

sduvvuri1603 avatar Dec 10 '25 19:12 sduvvuri1603

/retest

sduvvuri1603 avatar Dec 15 '25 20:12 sduvvuri1603

Thank you for the clarification @sduvvuri1603.

... for persistence/audit purposes only

Is it really a requirement? The parallelism is already persisted in the pipeline spec (which is stored when the version is created). What scenario requires reading from this ConfigMap rather than the pipeline spec?

hbelmiro avatar Dec 16 '25 10:12 hbelmiro

Changing the logic so that Argo reads from the ConfigMap, following the code reference "Using a semaphore configured by a ConfigMap".

sduvvuri1603 avatar Dec 16 '25 17:12 sduvvuri1603
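
For readers following along, the Argo pattern referenced above ("Using a semaphore configured by a ConfigMap") looks roughly like the following sketch. Names such as `semaphore-config` are placeholders, and the exact field layout varies by Argo Workflows version:

```yaml
# A ConfigMap holding the semaphore limit (key -> max concurrent holders).
apiVersion: v1
kind: ConfigMap
metadata:
  name: semaphore-config   # placeholder name
data:
  workflow: "2"            # up to 2 workflows may hold this semaphore
---
# In the Workflow spec, the semaphore references the ConfigMap key.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: semaphore-wf-
spec:
  entrypoint: main
  synchronization:
    semaphore:
      configMapKeyRef:
        name: semaphore-config
        key: workflow
```

In this pattern Argo reads the limit from the ConfigMap at admission time, which is why the API server owning the ConfigMap lifecycle (as the PR summary describes) keeps the keys aligned to `<pipeline>/<version>` without users editing shared ConfigMaps by hand.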

/retest

sduvvuri1603 avatar Dec 16 '25 21:12 sduvvuri1603

/retest

sduvvuri1603 avatar Dec 18 '25 15:12 sduvvuri1603