orleans
orleans copied to clipboard
[EPIC] Papercuts to fix for 4.0.0
This is an issue to log and track papercuts: small annoyances/inconveniences which developers often run into. Please feel free to add to the list and to pick off items on the list to fix.
- [ ] Silo shutdown takes too long, even in dev/test scenarios with a single silo and no persistence
- [x] Logging noise:
MemoryStorageGraingrain logsOnActivateAsyncmessage - [x] Logging noise: On shutdown, silo spews a lot of information which is not meaningful to developers
- [x] Logging noise: Gateway refresh notifications on client are unnecessary
- [ ] Logging noise: Periodic statistics dumps are unnecessary / unexpected
- [x] Logging noise: deployment load publisher
- [ ] Logging noise: Logging that
CleanupDefunctSiloEntriesis not supported is unnecessary, especially as a warning. We should also not throw an exception and should implement it on most/all providers - [ ] Logging noise: "No implementation of IHostEnvironmentStatistics was found. Load shedding will not work yet"
- Add config validator to only warn/error if
LoadSheddingOptions.LoadSheddingEnabledis true
- Add config validator to only warn/error if
- [ ] Logging noise:
VirtualBucketsRingProvider - [x] Logging noise:
Counters - [x]
ClusterId&ServiceIdshould have a configured default, eg"default" - [x]
ISiloBuildervsISiloHostBuilderis confusing - deleteSiloHostBuilderentirely and steer everyone towards generic host - [ ] The distinction between
Grain<T>andGrainwithIPersistentState<T>injected could be confusing - we should consider removingGrain<T> - [x] Application Parts should never need configuring (Fixed in #7070 by removing app parts entirely)
- [ ] Orleans calls issued before the silo/client has fully initialized should either throw or wait until startup has completed - waiting is probably nicer UX, as long as there's still a timeout to say that it's waiting for initialization.
- [ ] Introduce
ObserverManageras a utility to help with observers and update docs again - [x] Rename the Consul package to
Microsoft.Orleans.Clustering.Consuland clean up package metadata - [x] Rename the ZooKeeper package to
Microsoft.Orleans.Clustering.ZooKeeper& clean up metadata - [x] #7456
- [ ]
Microsoft.Orleans.ClientandMicrosoft.Orleans.Servershould pull inMicrosoft.Orleans.Sdkto reduce scope for errors.
May adding support for stateless worker grains in heterogeneous silos be considered as papercut? :)
@krasin-ga yes, it would. Which version are you running? I recall that we fixed an issue with StatelessWorker grains in heterogeneous clusters
I am using version 3.4.2.
A little preface. I use the concept of "roles" to segregate the responsibilities of silos. So a silo with a certain role places only those grains that belong to it. Something like a set of micro clusters within a cluster. This allows for flexible resource management in silo deployment, faster localization of problems, and an additional level of isolation.
But this approach has a few drawbacks.
-
This is the problem I was referring to. The documentation says that statless worker grains must be deployed on all silos. I got around this by using grains that handle one request and then get deactivated using the DeactivateOnIdle method. I didn't realize this could already be fixed :)
-
The grain catalog is common to all silos and restarting or crashing a silo with one role affects silos with other roles. Some of the grains are deactivated. Part of the requests may be lost #6746. Both client to silo and silo to silo. It may cause some downtime when a large ResponseTimeout has been set.
-
There were problems when starting clients and silos in a certain order. Clients kept not seeing certain grain implementations even though all the silos were already running by then. Maybe this was only within the time frame set by TypeMapRefreshInterval, I'll check that later.
It would be nice to be able to register a cluster client as a service rather than building it as a host. This would be more in line with its responsibilities. Right now, to register a client inside another host (like asp .net core), I use UseServiceProviderFactory and remove IHostApplicationLifetime along with logging services so they don't overwrite my registrations. Then IHostedService is used to initiate the connection of the client to the cluster.
I also propose to make IClusterClient a generic type: IClusterClient<TCluster>. This would make it fairly easy to register and use multiple cluster clients within the application, including within the silo.
@krasin-ga yes, it would. Which version are you running? I recall that we fixed an issue with StatelessWorker grains in heterogeneous clusters
I've checked, and everything actually works great. Thanks!
It would be nice to be able to register a cluster client as a service rather than building it as a host.
Maybe. The client is self contained. What I usually do, is to build an IClusterClient and register it on the generic Host DI container. After that, I register an IHostedService which injects this client and start/stop it. That way, when the application (i.e. ASP.Net) is starting, it will start the client and shutdown it gracefully with the shutdown.
May I suggest the good old timer reentrancy issue to be fixed? https://github.com/dotnet/orleans/issues/2574
@Xanatus Thank you, good suggestion
I register an
IHostedServicewhich injects this client and start/stop
@galvesribeiro
One of the things I've raised several times on the Discord is how many folks are looking for prescriptive guidance and best practices.
One of the great things about Orleans is how flexible it is and how it can be shaped to meet your exact use case and that probably has to do with not having lots of opinionated aspects to Orleans.
As Orleans gains additional momentum and adoption from the visibility of now being under the .NET Development Group, I think there may be a benefit to offering some level of guidance that meets the 80%.
A great example of this is the client connection story. Having to roll your own all the time can lead to subtle mistakes and gaps in the implementation.
It would be nice to have something out of the box that can just be configured and injected.
I haven't dug into 4.0 deeply yet so I'm not sure how much of this story has changed; please forgive me if this already exists.
I agree with you @ElanHasson.
There are some rework on the client itself of which it wont need its own DI container anymore and can be added directly to any IServiceCollection however, the initialization of the client still requires a an async call.
We have two options here:
- Pay the cost for the initialization on the first grain invocation transparently: This is "easier" but brings some complication to the table like queuing all the calls until the first one gets the client initialized and then starts dequeuing and invoking all others or any other synchronization mechanism;
- For applications that run using the Microsoft.Extensions.Hosting library we can just install an internal
IHostedServicewhich connects the client before all other services even start: This is what I manually do Today but this also wouldn't work if you are not using that library.
Do you have any suggestion what would be the best approach?
btw here is an example of what I have to do: https://github.com/web-scheduler/web-scheduler-api/blob/main/Source/WebScheduler.Client.Core/HostedServices/ClusterClientHostedService.cs#L78-L115
I'm not even sure how correct that is, but it works. (i know I have primitive retry handling there.)
@galvesribeiro. The Orleans strength is in it's utility.
I like option #2.
A IHostedService that can take the client builder as configuration and handles everything for you needed for managing the IClusterClient. Something like this as a NuGet would be perfect.
The remaining items on this list have been migrated to https://github.com/dotnet/orleans/issues/8257
Closing this - please bring up any relevant papercuts in the new thread.