Flow runs waiting for deployment concurrency limit congest work queue
Bug summary
When more than 200 + <limit value> flow runs are started for a deployment with a concurrency limit, all flow runs started afterwards transition directly to Late, even if they belong to deployments with no concurrency limit attached. They only start running once the number of scheduled flow runs falls below 200.
The Prefect worker logs repeat the following pattern:
15:31:55.282 | DEBUG | prefect.utilities.services.critical_service_loop - Starting run of 'get_and_submit_flow_runs'
15:31:55.283 | DEBUG | prefect.worker 4f69358c-286a-4b99-bef4-f621f04a3c0b - Querying for flow runs scheduled before 2025-11-07T23:32:05.283117+00:00
15:31:56.071 | DEBUG | prefect.worker 4f69358c-286a-4b99-bef4-f621f04a3c0b - Discovered 200 scheduled_flow_runs
15:31:56.072 | INFO | prefect.flow_runs.worker - Worker '4f69358c-286a-4b99-bef4-f621f04a3c0b' submitting flow run '068bef84-53a5-71e7-8000-b9af0ea9c9fc'
15:31:56.073 | INFO | prefect.flow_runs.worker - Worker '4f69358c-286a-4b99-bef4-f621f04a3c0b' submitting flow run '068beffe-50dd-7316-8000-3f8f62ce3c45'
15:31:56.074 | INFO | prefect.flow_runs.worker - Worker '4f69358c-286a-4b99-bef4-f621f04a3c0b' submitting flow run '0690e2cd-9f10-7919-8000-0acfb999e734'
... 197 more
15:31:56.989 | DEBUG | prefect.utilities.services.critical_service_loop - Starting run of 'sync_with_backend'
15:31:57.236 | INFO | prefect.flow_runs.worker - Aborted submission of flow run '0690e2f0-770b-7434-8000-8507a4910b67'. Server sent an abort signal: Deployment concurrency limit reached.
15:31:57.240 | INFO | prefect.flow_runs.worker - Aborted submission of flow run '0690e30a-992c-739d-8000-c37e08a2e7fe'. Server sent an abort signal: Deployment concurrency limit reached.
15:31:57.243 | INFO | prefect.flow_runs.worker - Aborted submission of flow run '0690e343-088d-7169-8000-5b0203be27fa'. Server sent an abort signal: Deployment concurrency limit reached.
... 197 more
This suggests that the worker fetches only 200 scheduled flow runs per poll, and that all of them belong to the deployment that has a concurrency limit attached to it. It seems that flow runs from other deployments, which were submitted afterwards without a concurrency limit and could actually run, are never seen by the worker until fewer than 200 flow runs with a concurrency limit remain scheduled.
This means that running a high number of flows with deployment concurrency limits blocks the work queue for all other deployments until the number of scheduled runs falls below the 200 threshold.
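The suspected behavior can be illustrated with a small standalone simulation. This is only a sketch of the hypothesis above, not Prefect's actual implementation: `Run`, `poll_once`, and `FETCH_LIMIT` are hypothetical names, and the model assumes that the worker receives the oldest scheduled runs first, that each poll is capped at 200 runs, and that aborted runs stay scheduled and are fetched again on the next poll.

```python
from dataclasses import dataclass

FETCH_LIMIT = 200  # batch size taken from "Discovered 200 scheduled_flow_runs" in the logs


@dataclass
class Run:
    run_id: int
    deployment: str
    limited: bool  # True if the deployment has a concurrency limit (assumed model)


def poll_once(scheduled, free_slots):
    """Simulate one worker poll and return the runs that would start.

    `scheduled` is ordered oldest first; `free_slots` is the number of
    concurrency slots currently free for the limited deployment.
    """
    batch = scheduled[:FETCH_LIMIT]  # the worker only ever sees this window
    started = []
    for run in batch:
        if run.limited:
            if free_slots == 0:
                continue  # aborted: the run stays scheduled for the next poll
            free_slots -= 1
        started.append(run)
    return started


# 250 concurrency-limited runs are scheduled first, then 10 unlimited runs.
backlog = [Run(i, "limited-deployment", True) for i in range(250)]
others = [Run(1000 + i, "other-deployment", False) for i in range(10)]

# All concurrency slots are taken: the batch holds only limited runs, so nothing starts.
blocked = poll_once(backlog + others, free_slots=0)
print(len(blocked))  # 0

# Once fewer than 200 limited runs remain scheduled, the other runs fit into the batch.
unblocked = poll_once(backlog[:150] + others, free_slots=0)
print(len(unblocked))  # 10
```

In this model the unlimited runs start only once the limited backlog shrinks below the batch size, which matches the observed behavior.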
Version info
Version: 2.20.16
API version: 0.8.4
Python version: 3.10.14
Git commit: b5047953
Built: Thu, Dec 19, 2024 10:55 AM
OS/Arch: linux/x86_64
Profile: default
Server type: cloud
Additional context
No response