
feat: optimistic scheduling

Open: abelanger5 opened this issue 3 months ago • 1 comment

Description

Adds support for "optimistic" scheduling: where possible, we create tasks directly from the gRPC engine with transactional safety and schedule them on workers which are connected to the current gRPC session. These are two separate concepts, referred to in code by localScheduler and localDispatcher. We allocate a small set of semaphores to bound this work.

Features:

  • Up to a 3x speedup in scheduling performance, from 24ms -> 8ms for single-task workflows
  • Reduces overall pressure on the message queue and downstream components, as there are fewer messages being passed for scheduling purposes
  • Improves listening for a task-completed event in single-task workflows by hooking into an existing tenant message, task-completed. We can add task-failed and task-cancelled in the same way in the future.

Drawbacks:

  • Increases the complexity of scheduling: the paths for optimistic scheduling are quite different from the regular path, since we protect everything with a single transaction
  • Can increase pressure on the engines. I've tried to avoid major issues by only allocating 10 "scheduling slots" to each gRPC process (configurable via an env var)
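
To illustrate the "scheduling slots" idea, here is a minimal sketch of a non-blocking counting semaphore with a fallback to the regular path. It is not the code in this PR: slotLimiter, schedule, and the EXAMPLE_OPTIMISTIC_SLOTS env var are hypothetical names, and falling back when no slot is free is an assumption about the behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// slotLimiter is a counting semaphore backed by a buffered channel.
// Hypothetical type, used only for illustration.
type slotLimiter struct {
	slots chan struct{}
}

func newSlotLimiter(n int) *slotLimiter {
	return &slotLimiter{slots: make(chan struct{}, n)}
}

// tryAcquire takes a slot without blocking and reports whether one was free.
func (l *slotLimiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release returns a previously acquired slot.
func (l *slotLimiter) release() { <-l.slots }

// slotsFromEnv reads the slot count from a HYPOTHETICAL env var
// and falls back to 10, the default mentioned in the description.
func slotsFromEnv() int {
	if v, err := strconv.Atoi(os.Getenv("EXAMPLE_OPTIMISTIC_SLOTS")); err == nil && v > 0 {
		return v
	}
	return 10
}

// schedule runs the optimistic path when a slot is free and otherwise
// falls back to the regular path (assumed behavior).
func schedule(l *slotLimiter, optimistic, regular func() error) (string, error) {
	if l.tryAcquire() {
		defer l.release()
		return "optimistic", optimistic()
	}
	return "regular", regular()
}

func main() {
	l := newSlotLimiter(slotsFromEnv())
	noop := func() error { return nil }
	path, _ := schedule(l, noop, noop)
	fmt.Println(path)
}
```

Because tryAcquire never blocks, a saturated gRPC process in this sketch degrades to the regular path instead of queueing requests behind the optimistic one.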

Limitations:

  • Scheduling child workflows is still significantly slower than scheduling non-child workflows, because we have ~6ms of latency due to how we're checking idempotency on the child workflow trigger
    • I think we could improve this a lot with idempotency keys; it really should only take a single database transaction to insert/look up the idempotency keys
  • This won't be turned on in HA mode, and as engines are scaled horizontally to n instances, the chance of optimistic scheduling drops to roughly 1/n, since we only use local schedulers when they hold a lease on a tenant. We will need to build out a sticky load balancing strategy to take advantage of optimistic scheduling in HA setups.
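
As a rough sketch of the idempotency-key idea above, the insert and the lookup can be one atomic step. The example uses an in-memory map guarded by a mutex as a stand-in for a database table with a unique constraint on the key; keyStore, getOrCreate, and the key format are hypothetical and not part of this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// keyStore is an in-memory stand-in for a table with a unique
// constraint on the idempotency key. Hypothetical, for illustration.
type keyStore struct {
	mu   sync.Mutex
	runs map[string]string // idempotency key -> child run ID
	next int
}

func newKeyStore() *keyStore {
	return &keyStore{runs: make(map[string]string)}
}

// getOrCreate inserts the key if it is absent and returns the run ID
// for it either way. The mutex plays the role of the single
// transaction: the lookup and the insert cannot be interleaved.
func (s *keyStore) getOrCreate(key string) (runID string, created bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if id, ok := s.runs[key]; ok {
		return id, false
	}
	s.next++
	id := fmt.Sprintf("run-%d", s.next)
	s.runs[key] = id
	return id, true
}

func main() {
	s := newKeyStore()
	first, created := s.getOrCreate("parent-1/child-0")
	fmt.Println(first, created) // run-1 true
	again, created := s.getOrCreate("parent-1/child-0")
	fmt.Println(again, created) // run-1 false
}
```

A repeated trigger with the same key gets the existing run back instead of creating a second child, which is the property the current idempotency check pays ~6ms for.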

Type of change

  • [X] New feature (non-breaking change which adds functionality)

abelanger5, Sep 07 '25 16:09
