Orleans client fails to load contract dll randomly
I have a small example/POC I'm working on where I want to demonstrate how Orleans can make life easier and also more robust.
If you want to go straight to the code it is here: https://github.com/mastoj/monostore/tree/random-error
The setup is that I have an API as an Orleans client, and then I have two different workers as silos, one for cart and one for product. I also have an API project for cart and one for product, which the main API project references, to keep the cart API definitions close to the rest of the cart implementation. The cart and product API projects each reference their own contract project, which defines the contracts for the API and grains.
So basically I have the below for cart (same for product):
API (Client) -> Cart API -> Contract
Cart Worker (Silo) -> Contract
When I make a call to the API to create a cart, https://github.com/mastoj/monostore/blob/a9abe53968a5ee17acc929c86a36ea29e8fc7cfd/src/cart/requests/requests.http#L5, it fails randomly with the exception
System.TypeLoadException: Unable to load MonoStore.Cart.Contracts.Grains.ICartGrain,MonoStore.Cart.Contracts from assembly MonoStore.Cart.Contracts ---> System.IO.FileNotFoundException: Could not load file or assembly 'MonoStore.Cart.Contracts, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified.
The exception is thrown in the API project, so the request never reaches the silo.
Everything is set up with Aspire, but I don't think that should impact how dlls are loaded.
I can reproduce this. Thank you for putting it together. This is currently a limitation of heterogeneous clusters. The workaround is to add all contract assemblies to all silos (all gateways, which is all silos in this case).
The limitation is at the RPC layer. I have a branch to fix this, but it's not in a mergeable state just yet.
After adding the Cart contract reference to the Product service, the request completes successfully.
The reason that it works sometimes even without this is that the client might send the request to a compatible gateway. cc @benjaminpetit: we could change client routing to pick compatible gateways while we prepare the true fix.
Thanks for the quick response.
To clarify, I only need to reference the contract, not the grain implementation?
Yes, that's correct
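For reference, the workaround described above might look roughly like the following in the Product silo's project file. This is only a sketch: the assembly name `MonoStore.Cart.Contracts` comes from the exception message, but the `MonoStore.Product.Contracts` name, the relative paths, and the SDK choice are assumptions and need to be adjusted to the actual repository layout. Only the contract project is referenced, not the grain implementation.

```xml
<!-- Hypothetical Product silo project file. Paths and the Product contract name are assumptions. -->
<Project Sdk="Microsoft.NET.Sdk.Worker">
  <ItemGroup>
    <!-- The silo's own contracts (assumed project name and path) -->
    <ProjectReference Include="..\MonoStore.Product.Contracts\MonoStore.Product.Contracts.csproj" />
    <!-- Workaround: also reference the Cart contracts so that this silo, acting as a gateway,
         can load ICartGrain when a client request for a cart grain is routed through it -->
    <ProjectReference Include="..\..\cart\MonoStore.Cart.Contracts\MonoStore.Cart.Contracts.csproj" />
  </ItemGroup>
</Project>
```

The Cart silo would need the mirror-image reference to the Product contracts, since every silo in this setup is a gateway.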
Had some issues with the OrleansDashboard as well; my guess is that they are related.
I actually have the issue from time to time even after adding the references. I have added a reference to all contract projects to all my silos and the API. Sometimes it works, then all of a sudden it fails. The changes can be seen in this PR: https://github.com/mastoj/monostore/pull/1/files
@ReubenBond, do you maybe know why I still see it? I find it very confusing because after starting up the cluster it can fail for a couple of requests, but after changing the id of the cart I try to create a couple of times, it starts working.
@ReubenBond I just hit this too, and I have a homogeneous environment.
My issue is not solved reliably. If I wait a little bit, it starts working most of the time.
My bad, actually my issue relates to #8200
@ReubenBond is it expected that the workaround would only work after a while? I still get the error, but after a couple of seconds the implementation is found.
@ReubenBond I have a working sample for placement filters where this scenario happens as well, in a homogeneous environment. My test code is not in a repo, but I can provide it to you if you wish.
Is it related to exception deserialization on the external client? I got exceptions about Redis storage when I turned it off, but the client doesn't know anything about Redis, and I got a TypeLoadException.
@ReubenBond here is my experimental/demo code where this scenario of random TypeLoadExceptions happens: https://github.com/miguelhasse/orleans/tree/demonstration_playground/samples/Orleans-Batch-Processing-on-Aspire