Configuration option for `OptimizationPreference`
Current codegen is really well suited for what I would call traditional servers / sustained-throughput workloads: we eagerly cache a lot of state so that the process can hit the ground running as requests start to come in. In the default configuration, this is what happens:
- Process starts up
- DI registrations run (registers all services, handlers, behaviors etc...)
- First reference to `IMediator` will instantiate `ContainerMetadata`, which loads all the caching for all messages/handlers
- All messages passing through Mediator will be very fast
AOT on the server side makes a lot of sense with serverless workloads, where the process is short-lived and we might prefer more laziness/less caching. I don't have much experience in this area, but I assume we might want the lifecycle to look more like this:
- Process starts up
- DI registrations run (registers all services, handlers, behaviors etc...) - we don't know which services we potentially don't need
- 1 HTTP request received
- 1 Mediator message dispatched - lazily resolve handlers for this specific message and nothing else
- Process shuts down (after serving response)
We could have an enum on `MediatorOptions` that looks something like this:

```csharp
enum OptimizationPreference
{
    SustainedThroughput,
    ColdStart,
}
```
Very unsure about the naming here; I tried to think of concepts that are more or less timeless, but I don't think SustainedThroughput makes any sense, heh
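Purely as a hedged sketch of how such an option could be consumed (the `services.AddMediator(options => ...)` shape mirrors the library's existing registration API, but `OptimizationPreference` itself is the proposal, not a shipped member):

```csharp
// Hypothetical usage - OptimizationPreference does not exist (yet).
services.AddMediator(options =>
{
    // Eager caching for long-running servers (today's default behavior)...
    // options.OptimizationPreference = OptimizationPreference.SustainedThroughput;

    // ...or lazy resolution for short-lived serverless processes.
    options.OptimizationPreference = OptimizationPreference.ColdStart;
});
```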
I agree with your summary and analysis.
AOT on the server side makes a lot of sense with serverless workloads, where the process is short-lived and we might prefer more laziness/less caching. I don't have much experience in this area, but I assume we might want the lifecycle to look more like this
My only suggestion is that, based on our use case, this might be premature optimization. Yes, you're absolutely right that our serverless workload is indeed short-lived, which is likely the most common scenario.
It seems that projects with a huge number of DI registrations wouldn't be a good match for cloud serverless functions, so it's possible that the effort to lazily load only the required handlers wouldn't provide a significant performance benefit. For example, our most recent project using Mediator has a total of only four handlers.
It's worth measuring, but my suspicion is that cold start perf won't be significantly impacted by lazy-loading / caching of handler registrations etc.
If you do decide to implement lazy loading, then maybe having a third option might be beneficial? Naming is hard, but maybe something like:

```csharp
enum LazyLoading
{
    Off,
    On,
    OnWithCaching
}
```
Purely a suggestion. :)
Anyway, we're excited to hear you're considering source gen to work around the open generics issue. That saves us the maintenance burden of manual registration.
Good points!
It's worth measuring, but my suspicion is that cold start perf won't be significantly impacted by lazy-loading / caching of handler registrations etc.
Yes, I think this is the next step for this issue. I actually think we should have a separate benchmark which measures all the way from Main to first response, so that we know the overhead for this specific workload.
It seems that projects that have a huge number of DI registrations wouldn't be a good match for cloud serverless functions, and as such is it possible that the effort to lazy load only the required handlers wouldn't provide a significant benefit to performance.
As a counterexample, we have a production app running with 8 Lambda functions, and the largest one includes over 100 handlers. As far as I know, as long as performance and memory are manageable, there's no limitation in a serverless environment that prevents you from having many handlers within a single Lambda. The recommended practice is to organize Lambdas based on domains or the types of tasks they perform, rather than strictly by their size.
So yeah, I'm definitely on board with this kind of setup.
I'm not sure why this would need to be an option. What is the benefit of creating all of the handlers at the same time? To me it seems like the library should just always be in the pay for what you use mode. One time hit the first time a handler is used to build up the handler but after that it's built and cached and fast.
Is this possibly to do with handler lifetime? If it's singleton you are expecting it to be created on app startup and it's scoped or transient you aren't?
What is the benefit of creating all of the handlers at the same time?
In the opposite case you incur the cost of having to check that fact every time you get a message (whether or not the handlers/pipeline is built). So I still believe that for long-running applications it's a worthwhile tradeoff to build everything on startup. I haven't seen techniques that manage to completely avoid that cost, but I'm open to suggestions.
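To make the per-message check concrete, here is a minimal sketch (not Mediator's actual generated code; the names are invented). In lazy mode every dispatch first asks "is the pipeline for this message type built yet?", typically via a thread-safe lookup:

```csharp
using System;
using System.Collections.Concurrent;

// Illustrative stand-in for the generated metadata cache; all names are invented.
public static class LazyDispatchSketch
{
    private static readonly ConcurrentDictionary<Type, Func<object, string>> Handlers = new();

    // Lazy mode: this GetOrAdd lookup is the per-message cost being debated -
    // cheap, but paid on every dispatch instead of once at startup.
    public static string Dispatch(object message) =>
        Handlers.GetOrAdd(message.GetType(), BuildPipeline)(message);

    // Runs once per message type, on the first dispatch of that type.
    private static Func<object, string> BuildPipeline(Type messageType) =>
        _ => $"handled {messageType.Name}";

    public static void Main() =>
        Console.WriteLine(Dispatch("ping")); // first call builds, later calls only pay the lookup
}
```

Eager startup trades this recurring lookup-or-build check for a one-time cost before the first request.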
Is this possibly to do with handler lifetime? If it's singleton you are expecting it to be created on app startup and it's scoped or transient you aren't?
Might be enough; the guidance would then be to choose transient/scoped if you run serverless (or similar) workloads. If the serverless instance can still receive multiple requests you incur additional cost though (as opposed to singleton + lazy loading).
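If I recall correctly, the lifetime is already configurable at registration via `MediatorOptions` (treat the exact option shape as an assumption), so the guidance above would boil down to something like:

```csharp
// Config fragment - assumes the existing ServiceLifetime option on MediatorOptions.
services.AddMediator(options =>
{
    // Singleton: everything built once up front, fastest steady state.
    // Scoped/Transient: less startup work, suggested above for serverless workloads.
    options.ServiceLifetime = ServiceLifetime.Scoped;
});
```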
What is the benefit of creating all of the handlers at the same time?
In the opposite case you incur the cost of having to check that fact every time you get a message (whether or not the handlers/pipeline is built). So I still believe that for long-running applications it's a worthwhile tradeoff to build everything on startup. I haven't seen techniques that manage to completely avoid that cost, but I'm open to suggestions.
Wouldn't that be a very cheap check though? Anyway, I was just wondering. No big deal either way.
Another question for you. Will you be releasing 3.0 soon? It seems like it's in pretty good shape.
+1 on the 3.0 release, we've found it stable too.
Hey everyone! Given the lack of response here, I decided to start a new project with ideas I've had about what my ideal mediator library would look like.
I've been working on Foundatio.Mediator - a blazingly fast, convention-based C# mediator that uses source generators and C# interceptors to get as close to direct method call performance as possible.
What Makes It Different
- Fast: Only slight overhead compared to calling methods directly
- Zero Boilerplate: No interfaces, no base classes - just name your stuff `*Handler` and you're done
- DI Just Works: Full Microsoft.Extensions.DependencyInjection support out of the box
- Rich Middleware: Before/After/Finally hooks that can short-circuit or pass state around
- Cascading Magic: Return tuples and extra values get auto-published as new messages
- Compile-Time Safety: Source generator catches problems at build time, not runtime
- Multi-Project Ready: Works great with vertical slice architecture
Dead Simple Example
```csharp
public record Ping(string Text);

public static class PingHandler
{
    public static string Handle(Ping msg) => $"Pong: {msg.Text}";
}

// That's it! No interfaces, no registration drama
var reply = mediator.Invoke<string>(new Ping("Hello"));
```
Would love to get some eyes on this from folks who've been thinking about mediator patterns. What do you think - does this approach resonate with anyone?
Check it out:
- NuGet Package
- GitHub Repository
- Docs & Examples
For sure, this issue will probably go into a minor bump. Updated the v3 issue: https://github.com/martinothamar/Mediator/issues/98#issuecomment-3153810976
@ejsmith again appreciate the work you've done there, but as mentioned in the v3 issue I'd rather have a spot in the README where we recommend (great) alternatives
Interested to know if there's any WIP for this functionality, or anything I can help out with or get involved in. I'm running a project using Mediator on AWS Lambda and observing a large cold start hit which seems to come from the exact issue described here (though it's hard to be certain), so I'm quite motivated to help get this moved forwards if possible.
I've started prototyping for this, it's the next thing on my list. It's going to come in a 3.1-preview release. I'll post back when I have something testable
I have a working solution here: https://github.com/martinothamar/Mediator/pull/238 Singleton results look good, but there is some stuff going on with Scoped/Transient that is kind of puzzling atm
I am no longer puzzled and have merged the PR mentioned above. Will summarize the results here (there are a lot more details in the linked PR for those that are curious):
| Lifetime | Mode | Before | After | Improvement |
|---|---|---|---|---|
| Singleton | Eager | 28.1ms | 20.1ms | 28.5% faster |
| Singleton | Lazy | 28.1ms | 3.5ms | 87.5% faster |
| Scoped | Eager | 9.7ms | 5.9ms | 39.2% faster |
| Scoped | Lazy | 9.7ms | 3.3ms | 66.0% faster |
| Transient | Eager | 9.8ms | 5.9ms | 39.8% faster |
| Transient | Lazy | 9.8ms | 3.4ms | 65.3% faster |
For reference, here are some baselines:
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| Baseline C | 0.1 ± 0.0 | 0.1 | 1.7 | 1.00 |
| Baseline Native AOT | 1.5 ± 0.1 | 1.3 | 2.3 | 15.48 ± 3.32 |
- Baseline C = statically linked C "hello world" executable (best case on my hardware)
- Baseline Native AOT = default Native AOT .NET 8 C# project, which sends a request but doesn't use Mediator (best case for Native AOT default config on my hardware).
Note that the benchmarks (not the baseline ones) are for "large project" configurations with Native AOT compilation, with the number of messages in the hundreds. For small projects (like a couple of messages), there are essentially no differences. If you look at the codegen for Eager in the ContainerMetadata, that is essentially what Lazy fixes: it no longer generates a bunch of dictionaries/frozen dictionaries for every message type on initialization. But for very small projects the cost of this initialization is likely to drown in the OS and .NET runtime cost of starting a process.
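A rough illustration of the Eager vs Lazy difference described above (NOT the actual generated code; the names and the string "handler" stand-ins are invented, where real codegen wires up delegates and pipelines):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

// Sketch of the startup-cost tradeoff; all names here are invented.
public static class MetadataSketch
{
    // Eager mode: a frozen lookup covering every message type is materialized at startup.
    public static readonly FrozenDictionary<Type, string> EagerHandlers =
        new Dictionary<Type, string>
        {
            [typeof(int)] = "Int32MessageHandler",
            [typeof(string)] = "StringMessageHandler",
            // ...hundreds more entries in a large project, all built before the first request
        }.ToFrozenDictionary();

    // Lazy mode: nothing is materialized until a message type is first dispatched.
    private static readonly Dictionary<Type, string> LazyHandlers = new();

    public static string Resolve(Type messageType)
    {
        if (!LazyHandlers.TryGetValue(messageType, out var handler))
            LazyHandlers[messageType] = handler = messageType.Name + "MessageHandler";
        return handler;
    }

    public static void Main() =>
        Console.WriteLine(Resolve(typeof(int))); // only the Int32 entry gets built
}
```

For hundreds of message types, skipping the eager dictionary construction is where the cold-start savings in the table come from.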
The README has been updated; I would appreciate any feedback from folks interested here. Keep in mind that containerization and virtualization will change the specific numbers here, so you should do your own benchmarking if these perf characteristics are important to you.
Just ran a preview release, should be indexed any minute: https://www.nuget.org/packages/Mediator.SourceGenerator/3.1.0-preview.5