prefect
prefect copied to clipboard
Agent: Add limit to control number of concurrent flow runs
Added a --limit option on 'prefect agent start' to limit the number of flow runs an agent will start simultaneously.
Example
prefect agent start -l 4
prefect agent start --limit 4
Checklist
- [ ] This pull request references any related issue by including "closes
<link to issue>
"- If no issue exists and your change is not a small fix, please create an issue first.
- [ ] This pull request includes tests or only affects documentation.
- [ ] This pull request includes a label categorizing the change e.g.
fix
,feature
,enhancement
Deploy Preview for prefect-orion ready!
Built without sensitive environment variables
Name | Link |
---|---|
Latest commit | f008732185e7cd2731a3eba7e905dd68e87b0fcb |
Latest deploy log | https://app.netlify.com/sites/prefect-orion/deploys/638631142e89bc0008e57f45 |
Deploy Preview | https://deploy-preview-7361--prefect-orion.netlify.app/api-ref/prefect/agent |
Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site settings.
Hi! Any news on the state of this PR? Having a limit on the agent level is a really nice feature that my projects could benefit as well
@eudyptula thanks for the pull request! This looks reasonable to me. We may want to clarify that this is a limit on running flows as we could also introduce a separate limit on concurrent submissions. We'll need tests before this can be accepted.
@madkinsz The agent should run no more than N flow runs concurrently with this new parameter. It will not take more flow runs from the work queue, than it can handle. Meaning that the remaining (scheduled) flow runs will stay in the queue for any other agent to pick up. At least if I made the update as I intended :D seems to work for us though.
Hope it answers your question?
Came out of this question on slack, btw.: https://prefect-community.slack.com/archives/CL09KU1K7/p1666864461860479
Yep it looks like that's what it does :)
We could also release capacity limit slots after task_status.started()
is called which would make the limit N flow runs submitting concurrently. Submission concurrency could be useful for some users as well and we'll want to make sure it's clear that the --limit
here applies to the full life cycle of the flow run infrastructure so when/if we add another type of limit the interface is still intuitive. I'm thinking ahead to a name for the submitting limit e.g. --submitting-limit
and wondering if we should do something similar here e.g. --running-limit
, but I think --limit
is actually more accurate in this case.
@madkinsz Not sure I completely understand what submission limit is actually limiting...
@eudyptula Well the process of creating infrastructure can be rate limited by cloud providers so you may want to have that concurrency limited. Anyway, I wouldn't worry about that.
If you add tests here, we can accept this :)
@madkinsz: I doubt it, but just realised that we may have to clarify if this update could somehow cause containers to not be shut down on crashes (in regard to my issue https://github.com/PrefectHQ/prefect/issues/7560)
@madkinsz:
- Fixed unittests so they should actually run now
- Realised that the agent may not take the flow runs in a sensible order if there're more than one queue, so added a sort on next_scheduled_start_time
- Wasn't a little unsure about this, but shouldn't get_and_submit_flow_runs() only return a list of the runs it actually submitted? Hope the filter I added makes sense for this. As far as I saw the return value is only used in unittests.
- Finally, I added a quick unittest. Wasn't sure how to get the agent to actually take the flows out of the queue, so right now the submitted_flow_run_ids list just seems to continue increasing within the test (the assert account for this however).
@madkinsz is "limit on concurrent submissions" in pipeline? This feature would be really useful.
@MuFaheemkhan it should be trivial to add following this pattern if it works well.
@eudyptula just fyi, I rebased onto main
to resolve the conflict
Edit: ~Looks like it stole ownership of your commits... I'll try to fix that.~ The original author is restored now.
I want to add that I have been using this PR directly for a couple of weeks and I can confirm it works the way it is supposed to! :) Also I noticed that https://github.com/PrefectHQ/prefect/pull/7362 could supplement this feature really well, like adding a tag that helps us identify the agent where the flow was executed.
@andreas-ntonas, Happy to hear you're finding it useful :)
@madkinsz, Just glad to see it making it through :) Looking forward to ditch my fork and use the official releases again, and when the network issues are resolved, hopefully we can finally complete our migration to Prefect 2.
@eudyptula while the real issue is buried deep in some http core libraries, #7593 should alleviate all of the issues for ya'll :)
Hey @madkinsz is this released for public use? If not, what is the timeline for its release? This seems like an important feature. Thanks :)