
Encountering Orleans.Runtime.OrleansMessageRejectionException

Open Ramanth opened this issue 1 year ago • 3 comments

We are using Orleans version 3.2.2 and recently ran into the following exception:

Orleans.Runtime.OrleansMessageRejectionException: Forwarding failed: tried to forward message Request S**...:11111:456892757grn/4CD39E8A/00000000+b4752f|||57D29D6->S...:11111:456892757grn/***/*******************[ForwardCount=2] for 2 times after EnqueueRequest - blocked grain to invalid activation. Rejecting now.

This does not happen very often. Can someone help us understand what causes it, and suggest how to handle it in the future?

Ramanth avatar Jul 01 '24 13:07 Ramanth

Hi @Ramanth, it's difficult to say precisely without looking at a memory dump (or !dumpasync output, or VS's Parallel Stacks > Tasks view), but it looks like your grain is in a hard deadlock. This could be caused by sync-over-async (e.g., calling Wait() on an incomplete task). If you can capture a memory dump of the process with the stuck activation, I can walk you through the diagnostics process.

The "EnqueueRequest - blocked grain" message you're seeing is emitted here: https://github.com/dotnet/orleans/blob/b24e446abfd883f0e4ed614f5267eaa3331548dc/src/Orleans.Runtime/Core/Dispatcher.cs#L473

I'd also recommend upgrading to 3.7.x if you can, or (even better) 8.x.

ReubenBond avatar Jul 01 '24 15:07 ReubenBond
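
For readers unfamiliar with the sync-over-async pattern mentioned above, here is a minimal, language-neutral sketch in Python. It is not Orleans code: the asyncio event loop only stands in for a single-threaded, turn-based scheduler, and the names blocking_turn, awaiting_turn, and run_demo are hypothetical. The blocking version waits synchronously for work that can only run on the thread it is occupying, so it never completes (a timeout is used here so the demo can finish); the awaiting version yields and completes normally.

```python
import asyncio
import concurrent.futures


def blocking_turn(loop, wait_seconds=0.2):
    """Sync-over-async: block the scheduler thread on work it must run itself."""
    done = concurrent.futures.Future()
    # This callback can only run once the loop thread is free again.
    loop.call_soon(done.set_result, "done")
    try:
        # Blocking here occupies the loop's only thread, so the callback
        # above cannot run. Without the timeout this would hang forever.
        return done.result(timeout=wait_seconds)
    except concurrent.futures.TimeoutError:
        return "deadlocked"


async def awaiting_turn(loop):
    """Async all the way: yield to the scheduler instead of blocking it."""
    done = loop.create_future()
    loop.call_soon(done.set_result, "done")
    return await done


async def _main():
    loop = asyncio.get_running_loop()
    blocked = blocking_turn(loop)        # runs synchronously on the loop thread
    awaited = await awaiting_turn(loop)  # lets the loop run the callback
    return blocked, awaited


def run_demo():
    return asyncio.run(_main())


if __name__ == "__main__":
    print(run_demo())
```

In a real grain there is no timeout to rescue the blocked turn, which would be consistent with later requests piling up behind it and eventually being rejected.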

Sure @ReubenBond, and thanks for the response. We have this service running on an ECS cluster, so I'll need to check how I can capture the dump and then get back to you here.

Ramanth avatar Jul 01 '24 16:07 Ramanth

In the meantime, you could look for suspicious code (e.g., code performing synchronous blocking) or other suspicious log lines. Later versions of Orleans add more diagnostics (and many other improvements), which could also help.

ReubenBond avatar Jul 01 '24 17:07 ReubenBond
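
As a starting point for the "look for suspicious code" suggestion above, here is a rough Python sketch that flags common synchronous-blocking calls in C# source text. The pattern list and the find_blocking_calls helper are assumptions made for illustration, not an Orleans tool. It is a line-based heuristic, so expect false positives (for example, a .Result property on a non-Task type) and misses.

```python
import re

# Heuristic patterns for calls that block synchronously on a Task in C#.
BLOCKING_PATTERNS = {
    "Task.Wait()": re.compile(r"\.Wait\s*\("),
    "Task.Result": re.compile(r"\.Result\b(?!\s*\()"),
    "GetAwaiter().GetResult()": re.compile(
        r"\.GetAwaiter\s*\(\s*\)\s*\.GetResult\s*\("
    ),
    "Task.WaitAll/WaitAny": re.compile(r"\bTask\.Wait(All|Any)\s*\("),
}


def find_blocking_calls(source):
    """Return (line_number, label, stripped_line) for each suspicious line."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Crude comment handling: ignore everything after a "//" marker.
        code = line.split("//", 1)[0]
        for label, pattern in BLOCKING_PATTERNS.items():
            if pattern.search(code):
                hits.append((number, label, line.strip()))
    return hits


if __name__ == "__main__":
    import pathlib
    import sys

    # Usage: python scan.py <source-directory>
    for path in pathlib.Path(sys.argv[1]).rglob("*.cs"):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, label, line in find_blocking_calls(text):
            print(f"{path}:{number}: {label}: {line}")
```

Each hit still needs a manual look: the ones that matter are those reachable from grain methods, where a blocked call holds up the activation.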