Overly long traces for `IAsyncEnumerable`

Open Tragetaschen opened this issue 3 months ago • 4 comments

I'm currently looking at a partial trace from production that was rejected because it grew beyond our configured limits. That particular example shows a runtime of >40 minutes with >9000 individual spans.

After the initial pair of

orleans-client IAsyncEnumerableGrainExtension/StartEnumeration<T> (504.6μs)
orleans-server IAsyncEnumerableGrainExtension/StartEnumeration<T> (54.2μs)

all other spans are subsequent pairs of

orleans-client IAsyncEnumerableGrainExtension/MoveNext<T> (196.39ms)
orleans-server IAsyncEnumerableGrainExtension/MoveNext<T> (196ms)

with up to 15s runtime based on the IAsyncEnumerable heartbeat.

In other parts of the application, we frequently create new root activities linked to a common trace to split up long running operations. What are our options here?

Tragetaschen avatar Oct 09 '25 16:10 Tragetaschen

We are running client and server with `.AddActivityPropagation()`, and this is what continuously adds new spans to the trace while the async enumerator is running. For the time being, we set `Activity.Current = null` before entering the async loop on the client, which successfully stops new spans from being published. This, however, cuts off any server-side processing from the trace, which was fine for our scenario because the server just exposes a ChannelReader without trace-worthy logic.
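For illustration, the workaround looks roughly like the sketch below. It only uses `System.Diagnostics.Activity`; the helper name `ConsumeUntracedAsync` and the `onItem` callback are hypothetical names for this example and not part of Orleans.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class UntracedEnumeration
{
    // Hypothetical helper: consumes `source` with no ambient activity, so calls made
    // while enumerating have no current activity to attach child spans to.
    public static async Task ConsumeUntracedAsync<T>(IAsyncEnumerable<T> source, Action<T> onItem)
    {
        var previous = Activity.Current;
        Activity.Current = null; // detach from the caller's trace for the duration of the loop
        try
        {
            await foreach (var item in source)
            {
                onItem(item);
            }
        }
        finally
        {
            // Restore the ambient activity for any code that follows in this method.
            Activity.Current = previous;
        }
    }
}
```

Because `Activity.Current` is backed by an `AsyncLocal`, clearing it inside the async helper does not change the caller's current activity.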

I can think of at least two things that can improve this:

First, the activity propagation could gain a configurable filter so applications can decide, based on the grain call context, whether activity propagation should be active. In this scenario, that would have allowed us to disable it for IAsyncEnumerableGrainExtension in general.

Second, the activity propagation could special-case IAsyncEnumerableGrainExtension and, instead of creating a child span for each individual call, create a new root activity with only a link to the existing one. This mimics how SignalR manages activities for individual hub calls on an existing connection.
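As a rough sketch of the second idea using only `System.Diagnostics` (this is not how Orleans implements propagation; `LinkedRootScope` and the source name `Example.LinkedRoots` are hypothetical names for this example):

```csharp
#nullable enable
using System;
using System.Diagnostics;

public sealed class LinkedRootScope : IDisposable
{
    // Hypothetical source name for this sketch; not an Orleans identifier.
    private static readonly ActivitySource Source = new ActivitySource("Example.LinkedRoots");

    private readonly Activity? _ambient;

    // The new root activity, or null when no listener samples the source.
    public Activity? Root { get; }

    private LinkedRootScope(Activity? ambient, Activity? root)
    {
        _ambient = ambient;
        Root = root;
    }

    public static LinkedRootScope Start(string name)
    {
        var ambient = Activity.Current;

        // Keep a causal reference to the long-running activity as a link, not a parent.
        var links = ambient is null
            ? Array.Empty<ActivityLink>()
            : new[] { new ActivityLink(ambient.Context) };

        // A default parent context falls back to Activity.Current, so clear it first
        // to make the new activity the root of its own trace.
        Activity.Current = null;
        var root = Source.StartActivity(
            name, ActivityKind.Client, default(ActivityContext), tags: null, links: links);

        return new LinkedRootScope(ambient, root);
    }

    public void Dispose()
    {
        Root?.Dispose();             // stops the root activity, if one was created
        Activity.Current = _ambient; // restore the long-running ambient activity
    }
}
```

Wrapping each individual call in `using (LinkedRootScope.Start("MoveNext")) { ... }` would then produce one short trace per call, each linked to the original trace instead of growing it.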

Tragetaschen avatar Oct 15 '25 09:10 Tragetaschen

@Tragetaschen are you saying that SignalR creates a new span for each element in an IAsyncEnumerable request to a hub?

I'd be ok with us doing the same thing here. It would mean that the traces will lose causality, unless there's another way to relate the spans back to the original parent through some metadata on the trace.

I prefer we do something universal rather than add an option, if there is a good option.

ReubenBond avatar Oct 22 '25 20:10 ReubenBond

We don't add activities for IAsyncEnumerable currently. See https://github.com/dotnet/aspnetcore/pull/55439#discussion_r1585343314 for a conversation about it, where we ended up deciding against adding any traces for now.

> It would mean that the traces will lose causality, unless there's another way to relate the spans back to the original parent through some metadata on the trace.

For this part, SignalR (and Blazor) use Links. We create a new Activity without a Parent and set a Link that points to the Parent.

BrennanConroy avatar Oct 22 '25 20:10 BrennanConroy

@BrennanConroy thanks for the context. These traces are sometimes useful, but if they are more noise than signal then let's remove the traces from IAsyncEnumerable<T> entirely for now.

@Tragetaschen how does that sound to you?

ReubenBond avatar Oct 22 '25 20:10 ReubenBond