apns2 icon indicating copy to clipboard operation
apns2 copied to clipboard

Client.Timeout exceeded while awaiting headers

Open deamonwatcher opened this issue 8 years ago • 44 comments

in my program, i made 1000 goroutines for concurrency send msg to apple; when i check my log, i find a lot of error returns when i call client.Push, the error is "net/http: request canceled (Client.Timeout exceeded while awaiting headers)"; the timeout setting of the client is 10s:client.HTTPClient.Timeout = time.Duration(10) * time.Second; i have no idea why this happened, is the timeout setting too short?

deamonwatcher avatar May 29 '16 02:05 deamonwatcher

I have the same problem

bittoy avatar May 30 '16 07:05 bittoy

I think you will find this is the same issue as #17 and #20 You are setting a Timeout on the default HTTP Client, but the occasional TLS Connection is not succeeding.

From the go docs:

Timeout specifies a time limit for requests made by this Client. The timeout includes connection time, any redirects, and reading the response body.

Therefore whilst you are getting an request canceled error, it is actually because of the TLS connection. I have pushed back a branch called fix-timeout with the code suggested by @c3mb0. This includes a default Timeout on both the Client and the TLS connection. https://github.com/sideshow/apns2/tree/fix-timeout.

I think if you try this branch you will get TLS timeout errors as expected. Unfortunately, You will still need to handle the retry logic yourself until apple fixes at their end.

sideshow avatar May 31 '16 10:05 sideshow

It may be that you are doing nothing wrong and Apple has patched the error at their end. I just tried now with the same code and i wasn't able to reproduce so hopefully they have fixed it.

We pushed back a change to master which has better timeout support. It has timeouts for both the TLS Connection and a standard Timeout, and both of these are configurable. Please try master and let me know if the problem still persists, otherwise i'll close this issue. Thanks https://github.com/sideshow/apns2/commit/14b46f872a11ff8501a315d16d848975772f2775

sideshow avatar Jun 11 '16 23:06 sideshow

i have updated the code from master branch and recompile my application, after runing for a period of time, unfortunately, the problem still persists: "Post https://api.push.apple.com/3/device/xxxxxx: net/http: request canceled (Client.Timeout exceeded while awaiting headers)"

deamonwatcher avatar Jun 13 '16 08:06 deamonwatcher

@deamonwatcher what timeout value are you using. Are you using the default timeout or have you got a smaller timeout?

sideshow avatar Jun 13 '16 08:06 sideshow

I use the default timeout,I think it's long enough

deamonwatcher avatar Jun 13 '16 08:06 deamonwatcher

And can you just confirm that apns2.TLSDialTimeout is less than apns2.HTTPClientTimeout I am guessing you have left this as default too (20 secs)

sideshow avatar Jun 13 '16 10:06 sideshow

apns_demo.txt I have uploaded a simplified version of my application for your reference; There are a few things that I need to point out:

  1. I didn't set any timeout config in my code, so the default timeout of HTTPClientTimeout is 30 secs,and the TLSDialTimeout is 20 secs;
  2. I need to send tens of millions of notifications to my customers every day, I'm not sure the error has some relationships with the high concurrence;
  3. somebody said it maybe a dns error of apns (Check here), but I can't reproduce it after trying many times, the ips returns from apns are all good ips;
  4. before compile my application, I set GODEBUG=http2debug=2, but during the run time, there are no http2 logs are outputted.

deamonwatcher avatar Jun 14 '16 06:06 deamonwatcher

Thanks for the extra info. I am going to try and reproduce this.

sideshow avatar Jun 16 '16 04:06 sideshow

  1. before compile my application, I set GODEBUG=http2debug=2, but during the run time, there are no http2 logs are outputted.

GODEBUG=http2debug=2 or GODEBUG=http2debug=1 is something you do at runtime. E.g.:

GODEBUG=http2debug=1 ./myapp

nathany avatar Jun 16 '16 17:06 nathany

@nathany thanks for your reminder. I add a prefix of "GODEBUG=http2debug=2" to my application at runtime. and reproduce the problem. here is the erro infomation: 2016/06/21 09:42:45 http2: Transport encoding header ":authority" = "api.push.apple.com" 2016/06/21 09:42:45 http2: Transport encoding header ":method" = "POST" 2016/06/21 09:42:45 http2: Transport encoding header ":path" = "/3/device/e104cfb645356df28f0e0d007725a19979f27f4849e838c508180058cxxxx" 2016/06/21 09:42:45 http2: Transport encoding header ":scheme" = "https" 2016/06/21 09:42:45 http2: Transport encoding header "content-type" = "application/json; charset=utf-8" 2016/06/21 09:42:45 http2: Transport encoding header "apns-topic" = "com.compayname" 2016/06/21 09:42:45 http2: Transport encoding header "apns-expiration" = "1466480534" 2016/06/21 09:42:45 http2: Transport encoding header "content-length" = "205" 2016/06/21 09:42:45 http2: Transport encoding header "accept-encoding" = "gzip" 2016/06/21 09:42:45 http2: Transport encoding header "user-agent" = "Go-http-client/2.0" 2016/06/21 09:42:45 http2: Framer 0xc820cbc000: wrote HEADERS flags=END_HEADERS stream=1453 len=74 2016/06/21 09:42:45 RoundTrip failure: net/http: request canceled

besides, I attach the right trace infomation of a successful send notification: 2016/06/21 09:42:45 http2: Transport encoding header ":authority" = "api.push.apple.com" 2016/06/21 09:42:45 http2: Transport encoding header ":method" = "POST" 2016/06/21 09:42:45 http2: Transport encoding header ":path" = "/3/device/94ddb890928d21cf18c8e149a9ab18ba4afb3a5eb385ff4b0ba1b2xxxx" 2016/06/21 09:42:45 http2: Transport encoding header ":scheme" = "https" 2016/06/21 09:42:45 http2: Transport encoding header "content-type" = "application/json; charset=utf-8" 2016/06/21 09:42:45 http2: Transport encoding header "apns-topic" = "com.companyname" 2016/06/21 09:42:45 http2: Transport encoding header "apns-expiration" = "1466480549" 2016/06/21 09:42:45 http2: Transport encoding header "content-length" = "205" 2016/06/21 09:42:45 http2: Transport encoding header "accept-encoding" = "gzip" 2016/06/21 09:42:45 http2: Transport encoding header "user-agent" = "Go-http-client/2.0" 2016/06/21 09:42:45 http2: Framer 0xc820cbc000: wrote HEADERS flags=END_HEADERS stream=1455 len=64 2016/06/21 09:42:45 http2: Framer 0xc820cbc000: wrote DATA stream=1455 len=205 data="{"aps":{"alert":"hello world","badge":1,"sound":"default"}}" 2016/06/21 09:42:45 http2: Framer 0xc820cbc000: wrote DATA flags=END_STREAM stream=1455 len=0 data=""

deamonwatcher avatar Jun 21 '16 02:06 deamonwatcher

It will occur if message was sended after connection had been idle for a long time( 30-60 minutes).

piouswolf avatar Jun 25 '16 14:06 piouswolf

Hey @deamonwatcher

Thanks again for your example code. I have been able to reproduce this RoundTrip failure: net/http: request canceled error.

The error appears to be happening because you are flooding the HttpClient with more requests than it can send in a reasonable time to APNS.

The underlying go HttpClient starts the Timeout timer when you call client.Push(notification), Because you are sending so many notifications the ones that are timing out have been waiting for longer than 30 secs.

A couple of ways you could fix;

  1. Set HTTPClientTimeout to zero. ie, apns2.HTTPClientTimeout = 0 This will make it never timeout. It will avoid the issue you are having, the down side to this is that if you ever have a request that hangs (rare but could happen if there was an APNS issue or http issue etc) it wont call timeout due to the way the underlying go client is written, so you wont get an error back.
  2. Set HTTPClientTimeout to a large number. If you put it up to say 2-3 mins, you may find you wont get these errors any more, but you will still get the advantage of catching long running requests and returning an error so you can retry these.
  3. Reduce the amount of go routines you are using (currently 1000) and use a buffered channel, this will introduce some back pressure and make sure that you are clearing out the enqueued pushes as fast as you can queue them up.

Hope this makes sense, I will be adding more to the docs to explain this. Please let me know how you get on as i'm keen to see if this resolves the issue.

@piouswolf I can't seem to trigger this idle connection scenario. Have tried multiple times with 1m, 15m, 30m, 45m and 1hr and the go HttpClient will always transparently reconnect. I think what may happen in this case (if you have seen this) is APNS is sending a GOAWAY http2 frame.

sideshow avatar Jun 26 '16 09:06 sideshow

@sideshow we still have this problem now, any update on this issue? we are using the latest master code.

NeoCN avatar Jul 19 '16 08:07 NeoCN

@sideshow @NeoCN
I found it happened if network was not stable between your server to APNs. It works well on my production servers which are deployed in Europe and America.

piouswolf avatar Jul 20 '16 02:07 piouswolf

@piouswolf thanks for the info. @NeoCN Did you try any of the three suggestions above and if so which ones? The problem is most probably that you are smashing the HttpClient with more notifications than the underlying connection can send, so the ones that cant send are timing out. This could be because you are on a flakey internet connection or sending too many on too many go routines. How many notifications are you sending at once? Regardless one of the above methods should resolve for you.

sideshow avatar Jul 20 '16 06:07 sideshow

This patch in Go 1.7 rc6 (and x/net/http2) may be related. https://github.com/golang/go/commit/7a622740655bb5fcbd160eb96887032314842e6e

nathany avatar Aug 08 '16 20:08 nathany

I have the same problem too. @sideshow There's only three gorouting to send notifications that I have used,

PS:

  1. I 'am in china, the network is not stable
  2. Only one instance of client that I used
  3. The timeout is default

nange avatar Sep 08 '16 10:09 nange

@nange What version of Go are you running? (go version) And what platform are you running this on? (Linux, macOS, Windows)

nathany avatar Sep 08 '16 15:09 nathany

@nathany go version: 1.6.3 platform: Linux(Centos 6.5)

like @piouswolf says: The errors always occur after connection had been idle for a long time

nange avatar Sep 09 '16 02:09 nange

The problem is related to the Provider setting IdleTimeOut for idle TCP connections. If you don't use the connection, it'll be automatically halted in minues or hours.

You'll have to send blank data every minute in order to keep it open, or contact your server provider to change the IdleTimeout of your server.

AbdullahDiaa avatar Oct 07 '16 07:10 AbdullahDiaa

@sideshow ,first, i ‘m sorry for not reply to you for such a long time, i'm a little busy a short time ago. Follow your advice, i made the following changes:

  1. Set HTTPClientTimeout to 200s
  2. Reduce the amount of go routines to 500(before is 1000)
  3. Use a buffered channel, and the capacity is 100

Now, the numbers of timeout error is much less than before, but it still have a fatal problem: sending speed. When the amount of notifications is relatively small, the average response time of calling client.Push() is about several hundreds milliseconds; but when i increase the amounts(about 10000 notifications per minutes), the average response time is very long, tens of seconds or even a few minutes.

This is unacceptable to my business, and i don't have a good way to solve this problem. Did you have any other advice?

My server environment: go version: 1.7 os verison: Red Hat 4.4.7-4

deamonwatcher avatar Oct 15 '16 16:10 deamonwatcher

@deamonwatcher this seems like a better channel/goroutine setup. Your speed is concerning. I have been able to get up to 5-10k pushes per second from Amazon AWS servers. Is your server located in the states and what is the network performance like? Also are you holding on to the client and not setting up a new one each time.

sideshow avatar Nov 03 '16 07:11 sideshow

I just faced this problem with 3 workers on recent master. Probably it was idle timeout, because we executing DEV testing right now, but it does not reconnect. I see about 6 errors for 1.5 minute time frame. Going to enable http2 and increase timeout to provide more logs.

But 2 hours later when I try to send another message, everything was fine

melkus avatar Nov 18 '16 03:11 melkus

@melkus some logs would be good. Thanks

sideshow avatar Nov 18 '16 04:11 sideshow

Sorry for delay, but we are developing and were playing with different apple certs dev/prod/voip.

Push service was started at 2016/11/21 17:24:27 it runs 3 workers. Each worker has its own apns2 client. Should I use 1 client? At 22:51:36 we got new message and one worker sent it to apple. Everything is fine. Then it was idle for a night. at 7:00:30 AM next day it received 3 messages via queue bus and started processing. Secondary and third workers were ok and processed 2 additional messages at 7:00:31. First worker failed at 07:01:30.

Next messages arrived at 07:05:04 and again some of them were successfully sent, some not. I am going to give a name to each worker to have it in logs. And re-create apns2 client on this error.

Interesting errors/warnings: 2016/11/22 07:00:31 Unhandled Setting: [HEADER_TABLE_SIZE = 4096] 2016/11/22 07:00:31 Unhandled Setting: [MAX_HEADER_LIST_SIZE = 8000] 2016/11/22 07:16:44 Transport readFrame error: (*net.OpError) read tcp 10.64.21.136:42491->17.188.166.96:443: read: connection timed out

Yes, "connection timed out" happened at 07:16 only. It is the first entry of this error in logs.

http2 logs, B8CA0FC17F615B0B009A4F2633B7A7DB2F745B748C22EDEDF1A96 token failed:

2016/11/22 07:00:30 http2: Transport encoding header ":authority" = "api.push.apple.com" 2016/11/22 07:00:30 http2: Transport encoding header ":method" = "POST" 2016/11/22 07:00:30 http2: Transport encoding header ":path" = "/3/device/B8CA0FC17F615B0B009A4F2633B7A7DB2F745B748C22EDEDF1A96581C4D62C94" 2016/11/22 07:00:30 http2: Transport encoding header ":scheme" = "https" 2016/11/22 07:00:30 http2: Transport encoding header "content-type" = "application/json; charset=utf-8" 2016/11/22 07:00:30 http2: Transport encoding header "apns-topic" = "com.fonality.hudio" 2016/11/22 07:00:30 http2: Transport encoding header "content-length" = "153" 2016/11/22 07:00:30 http2: Transport encoding header "accept-encoding" = "gzip" 2016/11/22 07:00:30 http2: Transport encoding header "user-agent" = "Go-http-client/2.0" 2016/11/22 07:00:31 http2: Transport creating client conn to 17.188.154.35:443 2016/11/22 07:00:31 http2: Transport encoding header ":authority" = "api.push.apple.com" 2016/11/22 07:00:31 http2: Transport encoding header ":method" = "POST" 2016/11/22 07:00:31 http2: Transport encoding header ":path" = "/3/device/BEE33068F845440C13997BEF9307755BFCFC8AC6A76E9E64FEB5A3B13E35D2CE" 2016/11/22 07:00:31 http2: Transport encoding header ":scheme" = "https" 2016/11/22 07:00:31 http2: Transport encoding header "content-type" = "application/json; charset=utf-8" 2016/11/22 07:00:31 http2: Transport encoding header "apns-topic" = "com.fonality.hudio" 2016/11/22 07:00:31 http2: Transport encoding header "content-length" = "153" 2016/11/22 07:00:31 http2: Transport encoding header "accept-encoding" = "gzip" 2016/11/22 07:00:31 http2: Transport encoding header "user-agent" = "Go-http-client/2.0" 2016/11/22 07:00:31 http2: Transport creating client conn to 17.188.154.35:443 2016/11/22 07:00:31 http2: Transport encoding header ":authority" = "api.push.apple.com" 2016/11/22 07:00:31 http2: Transport encoding header ":method" = "POST" 2016/11/22 07:00:31 http2: Transport encoding header ":path" = "/3/device/CBA032BB7F3E94E3EABD68DBE22F8E1BA6EBFD181C05C305BDCA5733D83F54F0" 2016/11/22 07:00:31 http2: Transport encoding header ":scheme" = "https" 2016/11/22 07:00:31 http2: Transport encoding header "apns-topic" = "com.fonality.hudio" 2016/11/22 07:00:31 http2: Transport encoding header "content-type" = "application/json; charset=utf-8" 2016/11/22 07:00:31 http2: Transport encoding header "content-length" = "153" 2016/11/22 07:00:31 http2: Transport encoding header "accept-encoding" = "gzip" 2016/11/22 07:00:31 http2: Transport encoding header "user-agent" = "Go-http-client/2.0" 2016/11/22 07:00:31 http2: Transport received SETTINGS flags=ACK len=0 2016/11/22 07:00:31 http2: Transport received SETTINGS flags=ACK len=0 2016/11/22 07:00:31 http2: Transport received HEADERS flags=END_STREAM|END_HEADERS stream=1 len=37 2016/11/22 07:00:31 http2: Transport encoding header ":authority" = "api.push.apple.com" 2016/11/22 07:00:31 http2: Transport encoding header ":method" = "POST" 2016/11/22 07:00:31 http2: Transport encoding header ":path" = "/3/device/916722539CE4EB223615E2A2AB64135FC844A4EE7E62409A8353EA278487B7C1" 2016/11/22 07:00:31 http2: Transport encoding header ":scheme" = "https" 2016/11/22 07:00:31 http2: Transport encoding header "content-type" = "application/json; charset=utf-8" 2016/11/22 07:00:31 http2: Transport encoding header "apns-topic" = "com.fonality.hudio" 2016/11/22 07:00:31 http2: Transport encoding header "content-length" = "153" 2016/11/22 07:00:31 http2: Transport encoding header "accept-encoding" = "gzip" 2016/11/22 07:00:31 http2: Transport encoding header "user-agent" = "Go-http-client/2.0" 2016/11/22 07:00:31 http2: Transport received HEADERS flags=END_STREAM|END_HEADERS stream=1 len=38 2016/11/22 07:00:31 http2: Transport encoding header ":authority" = "api.push.apple.com" 2016/11/22 07:00:31 http2: Transport encoding header ":method" = "POST" 2016/11/22 07:00:31 http2: Transport encoding header ":path" = "/3/device/AA93C40429DA6C841364FBF57DE2735E7B410CB968D22B373E450F920127A959" 2016/11/22 07:00:31 http2: Transport encoding header ":scheme" = "https" 2016/11/22 07:00:31 http2: Transport encoding header "content-type" = "application/json; charset=utf-8" 2016/11/22 07:00:31 http2: Transport encoding header "apns-topic" = "com.fonality.hudio" 2016/11/22 07:00:31 http2: Transport encoding header "content-length" = "153" 2016/11/22 07:00:31 http2: Transport encoding header "accept-encoding" = "gzip" 2016/11/22 07:00:31 http2: Transport encoding header "user-agent" = "Go-http-client/2.0" 2016/11/22 07:00:31 http2: Transport received HEADERS flags=END_HEADERS stream=3 len=32 2016/11/22 07:00:31 http2: Transport received DATA flags=END_STREAM stream=3 len=27 data="{"reason":"BadDeviceToken"}" 2016/11/22 07:00:31 http2: Transport received HEADERS flags=END_STREAM|END_HEADERS stream=3 len=31

My logs: 2016/11/22 07:01:30 RoundTrip failure: net/http: request canceled 2016/11/22 07:01:30 Post https://api.push.apple.com/3/device/B8CA0FC17F615B0B009A4F2633B7A7DB2F745B748C22EDEDF1A96581C4D62C94: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

melkus avatar Nov 23 '16 01:11 melkus

apns.TLSDialTimeout = 40 * time.Second apns.HTTPClientTimeout = 60 * time.Second

melkus avatar Nov 23 '16 01:11 melkus

i have the same problem. When i get this error, i recreate Client. Mostly works. i will try using Client with maxAge.

metinn avatar Dec 27 '16 11:12 metinn

@deamonwatcher hi, i also have the same problem. the following is my configuration: 1.HTTPClientTimeout defaulttimeout 2.goroutines is runtime.NumCPU() * 50 and apns2.Client is runtime.NumCPU() * 25 3.each coroutines has its own channel about send 500 per seconds and also have the timeout error. do you have a good way to improve the performance, if you have, please help me, thanks! ps:Whether I'll be baned ip about so many client connected? - -

silver0511 avatar Dec 28 '16 11:12 silver0511

Just to add to this thread: I was playing with the mattermost-push-proxy server which uses this package for sending out push notifications to Apple devices. I was seeing the Client.Timeout error followed by a panic and crash in the server. The panic was due to the fact that the code was first looking at the error response from the apns call, and then it was inspecting the response of the same call even if there was an error and the response was nil.

I fixed that, but I was still getting those errors. However, I only get them when the connection has been opened for some time without any activity. Once I attempt to send another notification, there is a delay, then the Client.Timeout error. Now, I just recreate the apns connection and try again. My guess is that the idle socket connection gets closed on the remote side but the near side does not see or does not process the remote close. A subsequent attempt to use the socket appears to work, but nothing coming back from the remote side hence the timeout.

bradhowes avatar Feb 21 '17 23:02 bradhowes