net/http: read: connection reset by peer under high load

Open liranp opened this issue 8 years ago • 20 comments

What version of Go are you using (go version)?

go1.8.3

What operating system and processor architecture are you using (go env)?

GOARCH="amd64" GOOS="linux"

What did you do?

  1. Edit /etc/sysctl.conf and reload with sysctl -p.
  2. Edit /etc/security/limits.conf.
  3. Run the following server and client (on different hosts) to simulate multiple HTTP requests: server https://play.golang.org/p/B3UCQ_4mWm, client https://play.golang.org/p/_XQRqp8K5b

What did you expect to see?

No errors at all.

What did you see instead?

Errors like the following:

Get http://<addr>:8080: read tcp [::1]:65244->[::1]:8080: read: connection reset by peer
Get http://<addr>:8080: write tcp [::1]:62575->[::1]:8080: write: broken pipe
Get http://<addr>:8080: dial tcp [::1]:8080: getsockopt: connection reset by peer
Get http://<addr>:8080: readLoopPeekFailLocked: read tcp [::1]:51466->[::1]:8080: read: connection reset by peer

liranp avatar Jul 09 '17 15:07 liranp

I've been seeing similar issues with a similar setup, where the server is on Go 1.6 while the client, which is making many concurrent requests, is on Go 1.9.2. Do we know what causes this? It seems like some kind of unintentional built-in rate limiting.

teejays avatar Dec 01 '17 05:12 teejays

Go has no built-in rate limiting.

If you get broken network connections, they might be real (a peer really did disconnect), or they might be from hitting kernel limits on one of the sides.

bradfitz avatar Dec 01 '17 05:12 bradfitz

Can anybody reproduce this and investigate?

bradfitz avatar Jul 09 '18 20:07 bradfitz

Adding a comment to remind myself but:

I've seen similar issues and should have time to investigate in the next month or so.

GeorgeErickson avatar Jul 12 '18 01:07 GeorgeErickson

In my opinion this issue occurs when the response is streamed into a dynamically allocated buffer.

No error is thrown when conn.Read() reads into a fixed-length buffer. The error is thrown sporadically when the response is read via io.Copy(&buf, conn), ioutil.ReadAll(conn), or conn.Read() streamed until EOF.

Notice: even when the error is thrown, the response is buffered correctly and completely.

Go version I used: 1.10.3. OS: FreeBSD 11.2 (64 bit).

dubbelpunt avatar Aug 24 '18 07:08 dubbelpunt

I am also seeing this issue on http load testing.

Maybe I am hitting kernel TCP stack limits. @bradfitz any idea how to check whether this is a kernel issue?
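One limit that can be checked from inside the process is the open file descriptor limit, since every TCP socket consumes one descriptor. The sketch below is only an illustration (the helper name openFileLimit is made up here), and it covers just one of several possible limits, so a healthy value does not rule out a kernel-side cause.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit (hypothetical helper) returns the soft and hard limits on
// open file descriptors (RLIMIT_NOFILE) as seen by the current process.
func openFileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	// The field types differ between platforms, so convert explicitly.
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

The numbers are the ones the Go process itself sees, which may differ from what ulimit -n prints in the shell that started it.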

drasko avatar Sep 17 '18 23:09 drasko

I am seeing this error frequently (read: connection reset by peer) when testing an HTTP 1.1 client that is generating heavy load on a server. I have not checked if the problem goes away with HTTP/2. (It could, since requests would be interleaved on a single connection).

I am running the client and server both on the same "localhost", and both are running a managed pool of "worker" goroutines in an attempt to limit the demand for resources.

The client and server each have 350+ goroutines (as reported by runtime.NumGoroutine()) and are running as root on a Mac laptop with the maximum number of open files and sockets raised to 8192 (ulimit -n 8192).

I can run the same code on Ubuntu 18.04 LTS with more cores, memory, sockets, etc, and I eventually run into the same error message, although at a higher traffic load.

If there is a kernel limit, it would be nice to know what it is, so we could write our code so that it intentionally stays below that limit, rather than tweaking parameters and hoping it won't be a problem.

I see this topic was active a year ago, ... has there been any progress on this in the past year @bradfitz ?

dxjones avatar Nov 21 '18 03:11 dxjones

Can I retry the HTTP request after a second?

mcauto avatar Nov 27 '18 09:11 mcauto

I also see this issue under macOS where server and client talk over localhost. I don't know the exact cause of the problem but running netstat during high load displays very large number of connections in TIME_WAIT state. Either I exhaust file descriptors or local ports.

Or both. I keep seeing connection reset by peer until the number of TIME_WAIT connections is about 7k. If I let the test run even further, I eventually get connect: can't assign requested address error which suggests that I exhausted local ports. The number of TIME_WAIT connections is at 15k.

For my project that's not needed and I could solve the problem by limiting number of concurrent connections. Solving the problem in my code by limiting number of goroutines seemed arbitrary and still occasionally failed. Limiting MaxConnsPerHost in http transport did the trick and problem went away completely.

creker avatar Dec 17 '18 16:12 creker

Also hitting this issue. I'm running 1000 concurrent connections and it eventually hits this, also on OSX, with file descriptors bumped up from the default (256) to 4096. I see a lot of reports about this; possibly it's only an issue on OSX.

REPTILEHAUS avatar Feb 07 '19 15:02 REPTILEHAUS

I'm also running into this issue with the latest release of Go. No matter what I try, I keep getting these "connection reset by peer" errors. Netstat shows hundreds of connections in TIME_WAIT state, even after shutting down my client.

transport = &http.Transport{ MaxIdleConns: 1, MaxIdleConnsPerHost: 1, IdleConnTimeout: 5 * time.Second }

The problem still happens with MaxIdleConns set to 1 and with runtime.GOMAXPROCS(1). I also tried calling transport.CloseIdleConnections() after each goroutine, but according to netstat the connections still show up as "TIME_WAIT".

gaby avatar Feb 08 '19 04:02 gaby

I managed to solve my TIME_WAIT problem using this article: http://tleyden.github.io/blog/2016/11/21/tuning-the-go-http-client-library-for-load-testing/

gaby avatar Feb 10 '19 06:02 gaby

@gabrielcalderon it doesn't fix the problem completely. The idle connection settings do not limit the number of connections Go can open. If you exhaust the idle connections for any reason, Go will start opening new ones and cause the TIME_WAIT problem again. What fixes the problem is the MaxConnsPerHost knob. It limits the number of connections Go can open and blocks subsequent requests once the limit is reached. You want to set it equal to or higher than the number of idle connections.

creker avatar Feb 10 '19 11:02 creker

Just to update: my issue was related to OSX and its defaults. Having tweaked the following settings, I can now push 1000 concurrent transactions:

sudo ulimit -n 6049
sudo sysctl -w kern.ipc.somaxconn=1024

REPTILEHAUS avatar Feb 21 '19 13:02 REPTILEHAUS

The only way I was able to fix it was to limit the number of concurrent transactions to be half the number of MaxIdleConns.

gaby avatar Feb 21 '19 13:02 gaby

@REPTILEHAUS Thanks for that. This worked well for me. I don't think "ulimit" was an issue, as it was set to "unlimited" already.

Can someone give some insight into what the side effects could be if I were to increase kern.ipc.somaxconn to an abnormally high number?

pnsvinodkumar avatar Apr 22 '19 00:04 pnsvinodkumar

Just to update: my issue was related to OSX and its defaults. Having tweaked the following settings, I can now push 1000 concurrent transactions:

sudo ulimit -n 6049
sudo sysctl -w kern.ipc.somaxconn=1024

Works for me on Mac OS @REPTILEHAUS .

iwind avatar Aug 31 '19 08:08 iwind

I have the same issue.

gatspy avatar Nov 25 '21 16:11 gatspy

Just to update: my issue was related to OSX and its defaults. Having tweaked the following settings, I can now push 1000 concurrent transactions:

sudo ulimit -n 6049
sudo sysctl -w kern.ipc.somaxconn=1024

For me, after doing this change, the error changed to connect: resource temporarily unavailable

AnubhavUjjawal avatar May 30 '22 20:05 AnubhavUjjawal

Thanks.

I personally think that this is not primarily caused by Go itself; it has almost nothing to do with it. In the Go codebase we can only limit the number of connections. To actually solve this, I believe it is just a matter of kernel tuning on the OS in question (Linux, macOS, or Windows).

Raising the open file descriptor limit and the maximum number of connections that can be queued, as in @REPTILEHAUS' solution, still works like a charm on macOS on a MacBook Air M1 with 16GB RAM. I was about to benchmark my Go app at high concurrency, and now there are no connection resets at a concurrency of 1000.

sudo ulimit -n 6049
sudo sysctl -w kern.ipc.somaxconn=1024

So every machine has its own variables to tune for performance; it is just a matter of adjusting them accordingly.

gnomefin avatar Jun 27 '24 18:06 gnomefin

Awesome, thx bro.

xmh1011 avatar Sep 10 '24 09:09 xmh1011