Durable Jobs follow-up
This issue documents follow-up items for Durable Jobs (#9717)
- [x] Rename to Durable Jobs (update projects, types, etc)
- [ ] Rename `ScheduledJobContext` to `ScheduledJobRun` (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3394912141)
- [ ] Address 50K append/blob limit for Azure Storage implementation (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3394719873)
- [x] Use `IOverloadDetector` in `LocalScheduledJobManager` to throttle execution when the host is overloaded (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3394850102)
- [ ] `IScheduledJobReceiverExtension.DeliverScheduledJobAsync` should return some kind of `ScheduledJobRunResult` which includes a `TimeSpan PollAfter` property so that long-running requests can be better supported.
- [ ] Flow `CancellationToken` in tests, so we can have a short (e.g., 2-min) timeout for each test.
- [ ] Make the non-`CancellationToken` arguments passed to `ILocalScheduledJobManager.ScheduleJobAsync` a class or struct to make it easier to add properties later without breaking existing callers/impls.
- [ ] Idea: Support multiple concurrent accounts in `AzureStorageJobShardManager` for improved scaling, migration, etc (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3404983204)
- [ ] Observability - Do a pass on tracing, metrics, logs
- [ ] Rebalancing - We need to de-assign/rebalance shards if we have too many to avoid skew after a new deployment or upgrade. (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3405930327)
- [ ] Concurrent shard limit - We should limit the number of concurrently assigned shards per silo to prevent memory exhaustion. (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3405930327)
- [ ] Shard assignment slow start - We should consider performing a slow start for shard assignment, only reading a number of shards based on how long the silo has been up. It's important for disaster recovery scenarios, as we have seen with Azure ML. (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3405930327)
- [ ] Concurrent job slow start - We should gradually increase job concurrency (semaphore.Release) during startup until we hit our target. This helps to avoid starvation issues which can happen before things have warmed up (caches, connection pools, thread pool sizing, etc). (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3405930327)
- [ ] Make sure we handle multi-cluster deployments more gracefully in AzureStorageJobShardManager (and other impls, ideally) (https://github.com/dotnet/orleans/pull/9717#pullrequestreview-3404974478)
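
For the "Concurrent job slow start" item above, here is a rough sketch of what gradually releasing semaphore permits could look like. `SlowStartGate` and all of its members are hypothetical names, not existing Orleans types, and the doubling schedule is only an assumption; the real implementation may ramp differently.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of Orleans: ramps a SemaphoreSlim from a small
// initial number of permits up to a target concurrency.
public sealed class SlowStartGate
{
    private readonly SemaphoreSlim _semaphore;
    private readonly int _target;
    private int _limit;

    public SlowStartGate(int initialConcurrency, int targetConcurrency)
    {
        if (initialConcurrency < 1 || initialConcurrency > targetConcurrency)
            throw new ArgumentOutOfRangeException(nameof(initialConcurrency));

        _target = targetConcurrency;
        _limit = initialConcurrency;
        // Start with only the initial permits; the rest are released over time.
        _semaphore = new SemaphoreSlim(initialConcurrency, targetConcurrency);
    }

    // Number of permits currently in circulation (held by jobs or available).
    public int CurrentLimit => _limit;

    // A job acquires a permit before running and returns it when done.
    public Task WaitAsync(CancellationToken ct) => _semaphore.WaitAsync(ct);
    public void Complete() => _semaphore.Release();

    // Doubles the limit (capped at the target) by releasing extra permits.
    // Returns false if the target had already been reached.
    // Assumption: only one caller drives Step/RampAsync at a time.
    public bool Step()
    {
        if (_limit >= _target) return false;
        int next = Math.Min(_target, _limit * 2);
        _semaphore.Release(next - _limit);
        _limit = next;
        return true;
    }

    // Waits one interval between steps until the target is reached.
    public async Task RampAsync(TimeSpan interval, CancellationToken ct)
    {
        while (_limit < _target)
        {
            await Task.Delay(interval, ct);
            Step();
        }
    }
}
```

A silo could start `RampAsync` once at startup and have each job call `WaitAsync` before executing and `Complete` afterwards, so early load stays low while caches, connection pools, and the thread pool warm up.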
@ReubenBond Regarding the item "Rename to Durable Jobs (update projects, types, etc)", I am confused: what about backward compatibility with Reminders V1?
Until now, I assumed that both Durable Jobs and Reminders would live side by side, so that users can migrate their existing reminders progressively. I assumed that the new name was meant to make this co-existence easier.
@nkosi23 Durable Jobs & Reminders will remain separate, so they can run side-by-side and you can gradually migrate from one to the other.