
Kestrel Server hangs after Out of memory exception ("The connection listener failed to accept any new connections")

Open wite27 opened this issue 3 years ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Describe the bug

Hi! I have an ASP.NET Core 5.0 application running in Docker on Kubernetes that handles WebSocket connections. Each instance uses about 1 GB of memory. Sometimes the load increases and we get out-of-memory exceptions; the process crashes and is then restarted by Kubernetes. But once we hit a case where the application simply stopped accepting new connections and the process did not crash. The last message from the instance was "The connection listener failed to accept any new connections", which is logged here https://github.com/dotnet/aspnetcore/blob/main/src/Servers/Kestrel/Core/src/Internal/ConnectionDispatcher.cs#L67 with a comment that describes the situation :)

Expected Behavior

I think the process should crash, as it does with other unhandled exceptions, because Kestrel cannot accept new connections anyway.

Steps To Reproduce

No response

Exceptions (if any)

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal.SocketConnection..ctor(Socket socket, MemoryPool`1 memoryPool, PipeScheduler transportScheduler, ISocketsTrace trace, Nullable`1 maxReadBufferSize, Nullable`1 maxWriteBufferSize, Boolean waitForData, Boolean useInlineSchedulers)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.SocketConnectionListener.AcceptAsync(CancellationToken cancellationToken)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.ConnectionDispatcher`1.<>c__DisplayClass9_0.<<StartAcceptingConnectionsCore>g__AcceptConnectionsAsync|0>d.MoveNext()

.NET Version

.NET 5.0.15

Anything else?

PS. We are using health checks, but they listen on a different ("internal") port, and in this situation they still respond with 200 OK.

build info:

os_version: 	Linux 5.5.17-050517-generic #202004130833 SMP Mon Apr 13 12:37:30 UTC 2020
runtime_version: 	.NET 5.0.15
target_framework: 	.NETCoreApp,Version=v5.0

wite27 avatar May 06 '22 14:05 wite27

This is a good one! I'm pretty sure I wrote that comment...

davidfowl avatar May 06 '22 15:05 davidfowl

Thanks for contacting us.

We're moving this issue to the .NET 7 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

ghost avatar May 06 '22 20:05 ghost

The server isn't able to accept new connections at this point anyway so we should just crash here.

adityamandaleeka avatar May 06 '22 20:05 adityamandaleeka

@adityamandaleeka Do you still want this for rc2? It seems a little more relevant considering #43723.

halter73 avatar Sep 02 '22 22:09 halter73

Thanks for contacting us.

We're moving this issue to the .NET 8 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue, during our next planning meeting(s). If we later determine, that the issue has no community involvement, or it's very rare and low-impact issue, we will close it - so that the team can focus on more important and high impact issues. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

ghost avatar Sep 09 '22 20:09 ghost

The same story in containers with a memory limit. For now we have added custom middleware that handles OOM and fails the process, but I'd prefer this as built-in behaviour in Kestrel.

Sergey-Terekhin avatar Sep 10 '22 04:09 Sergey-Terekhin
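
For readers wondering what the custom-middleware workaround from the last comment might look like, here is a minimal sketch. It is not the commenter's actual code: the class name `FailFastOnOutOfMemoryMiddleware` is hypothetical, and the choice of `Environment.FailFast` as the way to terminate the process is an assumption.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Hypothetical middleware name; an illustration, not the commenter's implementation.
public sealed class FailFastOnOutOfMemoryMiddleware
{
    private readonly RequestDelegate _next;

    public FailFastOnOutOfMemoryMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        try
        {
            // Run the rest of the request pipeline.
            await _next(context);
        }
        catch (OutOfMemoryException ex)
        {
            // Assumption: terminating the process is preferable to limping along,
            // so that the orchestrator (e.g. Kubernetes) restarts the container.
            Environment.FailFast("OutOfMemoryException in the request pipeline; terminating.", ex);
        }
    }
}
```

It would be registered early in the pipeline with `app.UseMiddleware<FailFastOnOutOfMemoryMiddleware>();`. Note that middleware like this only sees exceptions thrown while a request is being processed. The stack trace in this issue shows the exception being thrown inside `SocketConnectionListener.AcceptAsync`, before any middleware runs, so a sketch like this would likely not catch that specific failure.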