efcore Allow opt-out from buffering when a retrying strategy is in use

We currently force buffering whenever a retrying strategy is in use, to ensure consistency as results are (partially) read more than once. There are several scenarios where this may not be necessary:

If an entire block is retried, and the block has no side-effects until it completes successfully. This may be the typical case when strategy.Execute(() => { ... }) is used.
If the user knows that nobody is concurrently updating the database during the operation

Allowing the user to opt out of buffering could significantly improve memory consumption and perf in general.

Dec 17 '20 16:12 roji

Hi @roji, I’m looking into connection resiliency in EF Core and saw the note in the documentation about buffering when a retry strategy is in use and I didn’t quite understand the reasoning. I found this issue while trying to find further info, I hope it’s ok to ask you a couple of questions here as it’s at least partly relevant to it.

The document states “This is done to make sure the same results are returned if the query is retried later.”

Is this saying that the results are cached so that if the same query is repeated on a retry, a database query is not actually performed but the cached results used? This would seem strange to me as if I manually repeated a query I would expect a second database query. If I have misunderstood (which is likely!) are you able to explain in another way? Essentially I do not quite get the reasoning for buffering results and why the possibility of an error and a retry forces that behaviour. If an error occurred during streaming why couldn’t the next retry just stream a fresh query again?

Thanks

Jul 14 '21 19:07 stevendarby

@stevendarby of course it's OK to ask questions.

To understand the problem with streaming, imagine that your application is half-way through reading a resultset from the database, and some transient error occurs. Your application state has already consumed half the resultset and done something with it - maybe it has sent it somewhere else, or similar. If the retrying strategy now retries the query, it will receive a whole new resultset from the server - but what should be done? We can't start handing off rows to the application from the beginning of the resultset, since that would mean it sees the same row twice. We also can't seek to the exact place where the exception occurred in the first execution, because the resultsets may be slightly different (updates may have been committed between the queries). We're stuck.

To address this, EF Core buffers the entire resultset before returning the first row to the application. If a transient error occurred, the query is simply retried and the new resultset is read from the beginning (since the application hasn't yet received a single row).

Hopefully this clarifies things - feel free to post additional questions.

Jul 14 '21 21:07 roji

@roji that has cleared things up a lot for me, thanks.

I think where I’d gone wrong is thinking primarily about a query within a manual retry block with enumeration that is safe to repeat (e.g. simply using client side filtering to build a smaller list). Doing something with the results of a query (buffered or not) in a manual retry would always have to be repeatable because even if the query succeeded something later could fail.

I hadn’t properly considered an execution strategy operation on a query ‘ambiently’ as a single unit of work and where enumeration could have side effects.

I also now much better understand this issue you’ve raised, to allow buffering when it’s safe to do so (the cases I’d primarily been thinking about!). Thanks

Jul 15 '21 06:07 stevendarby

@stevendarby that's very true. When using a manual retry block, it indeed makes sense to avoid buffering since application code is already written in a way which would react well to retries; this is exactly what this issue is about. But manual retry blocks are the rarer option - most cases of using retry strategies simply execute queries/updates without blocks, so they need buffering.

Jul 15 '21 15:07 roji

Consider the inverse (forcing buffering) when working on this. Debatable whether it is useful, but might be worth it for API symmetry if we get it for a discount price. See #10159.

Oct 27 '21 20:10 ajcvickers

I'm trying to disable buffering for a query but the context was instantiated w/EnableRetryOnFailure - is it possible to opt-out of the retry strategy for one query or do we need a whole different context?

Sep 01 '22 18:09 gfody

@gfody Assuming you are using SQL Server you could wrap the query in

new SqlServerRetryingExecutionStrategy(context, maxRetryCount: 0).Execute(() => myQuery);

Otherwise, create a class derived from ExecutionStrategy and set maxRetryCount to 0 or override RetriesOnFailure to return false

Sep 02 '22 16:09 AndriySvyryd

Will it be implemented soon?

Nov 16 '22 16:11 LeszekKalibrate

This issue is in the Backlog milestone. This means that it is likely not going to happen for the 8.0 release. We will re-assess the backlog following the 8.0 release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.

Nov 16 '22 17:11 roji

Otherwise, create a class derived from ExecutionStrategy and set maxRetryCount to 0 or override RetriesOnFailure to return false

I tried this out with Npgsql provider (using NpgsqlRetryingExecutionStrategy) and IAsyncEnumerable but couldn't get it to work. As far as I could see, QueryCompilationContext, which sets IsBuffering is only created when IAsyncEnumerable starts being consumed (JsonConverter in my case). This means that ExecutionStrategy.Current is set back to null as ExecuteImplementation finishes and ExecutionStrategy from context.Dependencies is used (one set when configuring context).

Snippet of how it looks:

        var strategy = new NpgsqlRetryingExecutionStrategy(ctx, 0);
        return strategy.Execute(query, baseQuery => baseQuery
            .Select(e => transform(e))
            .AsAsyncEnumerable());

I imagine there is no way to work around this, when consumer of IAsyncEnumerable is out of my control? As I understand, enumerable has to be realized and during first MoveNextAsync everything is initialized. Until this feature is implemented I imagine only way to avoid buffering when using IAsyncEnumerable is to have separate DbContext without execution strategy (which would incur additional costs as there would be 2 connection pools, unless npgsql somehow shares them between dbcontext's)?

Dec 13 '22 18:12 Hoffs

We have seen some OOM exceptions in containers with memory limits that tend to be associated with executing a query that returns an IAsyncEnumerable over a sizeable resultset, which eventually led me to this issue.

Am I right in understanding that wiring a query.AsAsyncEnumerable() back through to an endpoint to stream a sizeable response is pointless if EnableRetryOnFailure() has been set on the context - in that there will be no memory savings as the results are now going to be buffered?

My sense is that the majority of transient errors are at the point a database connection is established. Errors in the middle of streaming results are much less common. Therefore, you might well choose to adopt a retry policy that does not buffer and accept that on rare occasions it could blow up in the middle of streaming a response to the API client.

If I wanted to implement a SqlServerRetryingNoBufferingExecutionStrategy, it looks like I could override RetriesOnFailure to return false (as suggested above to prevent buffering) and this will only impact buffering, although that seems wrong as it will be retrying.

Maybe ExecutionStrategy should have an IsBuffering property:

    public virtual bool IsBuffering => RetriesOnFailure;

(and a corresponding IsBufferingExecutionStrategy on QueryCompilationContextDependencies)

and then QueryCompilationContext can set IsBuffering from these?

Jan 21 '23 13:01 frankbuckley

Am I right in understanding that wiring a query.AsAsyncEnumerable() back through to an endpoint to stream a sizeable response is pointless if EnableRetryOnFailure() has been set on the context - in that there will be no memory savings as the results are now going to be buffered?

Not quite. If you do e.g. ToListAsync, the resultset would be stored in memory twice: once internally for the retrying strategy, and once in user code as the list. Changing ToListAsync to AsEnumerableAsync removes the 2nd one, which is quite an improvement (buffer once instead of twice).

I understand the desire to avoid any duplication here... Rather than restricting retries to connecting and giving up on the execution, I'd rather we provided a coding pattern which allows the user to avoid buffering even when execution, e.g. by having the entire code block retried (as suggested in the original post above). That would provide the best of all worlds, I think.

Jan 21 '23 14:01 roji

Thanks for the quick reply!

I agree – the option to avoid buffering by retrying the code block would be a good solution in many cases.

However, if you are IAsyncEnumerable all the way through to the endpoint (no ToArray(), etc.) and an error occurs while reading results, the error will surface in the middle of the response being written to the network stream – when some may have already been sent back to the client – and there is not really any way of recovering from there – the client will need to retry.

I would imagine if you catch an exception in the middle of streaming the results to the response and reexecute the query and then continue, at best you will get repeat entries in your response, but more likely an unreadable response.

In this scenario, it seems you still need to be able to implement a custom ExecutionStrategy that retries on exceptions we can be confident are not in the middle of reading results, but fails otherwise – and has buffering turned off.

Jan 21 '23 14:01 frankbuckley

From what I can see, RetriesOnFailure is only ever used to configure IsBuffering, except for an error check in ExecutionStrategy.OnFirstExecution(), so if that is overridden to ensure the check is still performed, it would be possible to create a retrying-but-not-buffering execution strategy?

(FWIW, I still think a IsBuffering property on ExecutionStrategy - separate from RetriesOnFailure would be helpful here.)

Jan 21 '23 14:01 frankbuckley

Not quite. If you do e.g. ToListAsync, the resultset would be stored in memory twice: once internally for the retrying strategy, and once in user code as the list. Changing ToListAsync to AsEnumerableAsync removes the 2nd one, which is quite an improvement (buffer once instead of twice).

Yes, but if you are IAsyncEnumerable (or IEnumerable, for that matter) through to writing the response, turning on retries increases the memory requirement from 1 row to n rows where n could easily be 50 or 100.

Fundamentally, if you want to stream your results through to the response to minimise memory load, and you want retries to the database (where possible), then you have to accept you can only retry up until you start iterating over the results - and leave retries for failures at this point to the downstream client.

If adding an IsBuffering property to ExecutionStrategy were acceptable, I would be willing to work on a PR to help bring this forward to 8.0.

Jan 21 '23 15:01 frankbuckley

However, if you are IAsyncEnumerable all the way through to the endpoint (no ToArray(), etc.) and an error occurs while reading results, the error will surface in the middle of the response being written to the network stream – when some may have already been sent back to the client – and there is not really any way of recovering from there – the client will need to retry.

Stepping back... At the end of the day, fully streaming and retrying are incompatible concepts. If you're fully streaming (no buffering whatsoever), then you cannot retry reliably, since an exception may have occured in the middle of streaming, and you've already delivered some data to the client. So in order to retry safely, some form of buffering is mandatory. The current situation seems somewhat OK to me: using a retrying strategy and returning an IAsyncEnumerable automatically buffers inside EF Core, ensuring safe retries.

In this scenario, it seems you still need to be able to implement a custom ExecutionStrategy that retries on exceptions we can be confident are not in the middle of reading results, but fails otherwise – and has buffering turned off.

If your goal here is to only retry on connection open failures, you should be able to open the connection yourself before issuing the query, and run that in the retrying strategy only. After that, execute your query without a retrying strategy.

[...] it would be possible to create a retrying-but-not-buffering execution strategy?

Probably, but again, that would be unsafe. If an exception actually occurs, you cannot actually retry since you've already delivered data to the client. So you'd be sending certain results twice.

Yes, but if you are IAsyncEnumerable (or IEnumerable, for that matter) through to writing the response, turning on retries increases the memory requirement from 1 row to n rows where n could easily be 50 or 100.

Yes, that is the price you pay for a safe, retrying strategy. What I wrote was that using AsAsyncEnumerable/ AsEnumerable (as opposed to ToListAsync/ToList) at least avoids buffering twice.

What's your exact proposal here, given the above?

Jan 21 '23 19:01 roji

Thank you, again.

I think we are in agreement re retry options and trade-offs when considering memory use.

The current implementation of ExecutionStrategy ties IExecutionStrategy.RetriesOnFailure to the setting of QueryCompilationContext.IsBuffering – in other words, the decision to buffer is dependant on the decision to retry.

This prevents the implementation of an ExecutionStrategy that aims to retry in certain circumstances but not to buffer.

One use case for this scenario is retrying on errors arising from connections, but not from errors arising from streaming results – in order to successfully stream results in most cases and leave the service client to retry in the (less common) scenario that an error occurs while reading the resultset.

Given this, I would propose that a property is added to IExecutionStrategy:

bool IsBuffering { get; }

and implemented on ExecutionStrategy:

public virtual bool IsBuffering => RetriesOnFailure;

and QueryCompilationContextDependencies amended:

IsBufferingStrategy = executionStrategy.IsBuffering;

...

public bool IsBufferingStrategy { get; init; }

and QueryCompilationContext amended to:

IsBuffering = ExecutionStrategy.Current?. IsBuffering?? dependencies. IsRetryingExecutionStrategy;

This would allow implementations of ExecutionStrategy to return true for RetriesOnFailure while returning false for IsBuffering, therefore enabling them to be designed to retry in certain circumstances but not to buffer.

Jan 21 '23 21:01 frankbuckley

and implemented on ExecutionStrategy:

Sorry, didn't notice SqlServerExecutionStrategy does not derive from ExecutionStrategy, so that will also need to implement:

public virtual bool IsBuffering => RetriesOnFailure;

Thinking about it, it may be better to add a default implementation to the interface.

Jan 21 '23 22:01 frankbuckley

Thanks for the added details. Specifically for enabling retries doing connecting but not during query execution, I proposed the above:

If your goal here is to only retry on connection open failures, you should be able to open the connection yourself before issuing the query, and run that in the retrying strategy only. After that, execute your query without a retrying strategy.

Of course, that requires some custom coding - but you can use that specifically where queries return huge resultsets, and so buffering isn't desired but you still want to retry during connecting.

I think this is the first time we receive a request to have this connection/execution split. You're welcome to open a new issue for it (it's separate from what this issue discusses), though we likely won't get around to it any time soon.

Jan 21 '23 23:01 roji

BTW a problem I see with the specific proposal, is that it makes it easy to execute a query thinking that proper retrying is going on, when it isn't (because buffering is disabled); that's quite a pit of failure for unsuspecting users. If anything, I'd make it very explicit when retrying is actually occuring.

Jan 21 '23 23:01 roji

I think this is the first time we receive a request to have this connection/execution split. You're welcome to open a new issue for it (it's separate from what this issue discusses), though we likely won't get around to it any time soon.

FWIW this issue may already cover that: https://github.com/dotnet/efcore/issues/30023

Jan 21 '23 23:01 stevendarby

With regards to the "double buffering" of retry + ToListing, could EF Core's ToListAsync extension method avoid this by somehow passing through the list used for buffering? This might not be feasible- wanted to float the idea before raising a separate issue to consider it in more detail. (Non-async ToList probably couldn't do this as it's not owned by EF Core.) edit: actually, two lists of the same objects would just be duplicating references and not much of issue, so there is probably something else going on here and this idea won't make much sense!)

Jan 22 '23 08:01 stevendarby

@stevendarby not buffering for ToListAsync (and very theoretically, also for ToList) is covered by #20076. It's definitely something we should do at some point... The asymetry between ToListAsync and ToList on this wouldn't be great, but I agree we shouldn't block making ToListAsync better just because of that.

Jan 22 '23 12:01 roji

You're welcome to open a new issue for it (it's separate from what this issue discusses)

#30112

Jan 22 '23 12:01 frankbuckley

BTW a problem I see with the specific proposal, is that it makes it easy to execute a query thinking that proper retrying is going on, when it isn't (because buffering is disabled); that's quite a pit of failure for unsuspecting users. If anything, I'd make it very explicit when retrying is actually occuring.

Agreed. I don't think I would want an extension method on SqlServerDbContextOptionsBuilder - users should have to explicitly create a (say) SqlServerRetryingNoBufferingExecutionStrategy for which all the caveats would be documented.

That said, today retries are to some extent a pit of failure. Admittedly, I cannot have read the docs since they were updated a couple of years back, but buffering when I thought we were streaming caught me by surprise.

Jan 22 '23 12:01 frankbuckley

I think increased memory usage isn't quite the same kind of pit of failure as a retry which doesn't buffer, and ends up delivering duplicate results (or whatever)...

Jan 22 '23 13:01 roji

I think increased memory usage isn't quite the same kind of pit of failure as a retry which doesn't buffer, and ends up delivering duplicate results (or whatever)...

True

Jan 22 '23 13:01 frankbuckley

Relates to https://github.com/npgsql/efcore.pg/issues/2725

Apr 25 '23 05:04 mu88

efcore efcore copied to clipboard

Allow opt-out from buffering when a retrying strategy is in use

efcore
efcore copied to clipboard