Kitura-net
Errors in removeIdleSockets
In my server log I see these errors:
[IncomingSocketManager.swift:252 removeIdleSockets(removeAll:)] epoll_ctl failure. Error code=1. Reason=Operation not permitted
It seems that removeIdleSockets is getting an EPERM error when cleaning up idle connections.
I'm running my server using this Docker container. It is deployed on Amazon ECS.
My server is mostly a websocket server. There is a memory leak I'm trying to investigate and I wonder if this might be related.
I haven't seen this issue myself, though I am currently investigating a potential threading issue around removeIdleSockets (issue #237) which could be related.
There are two threads on Linux which perform epoll_wait, with connections distributed between them. These threads should be the only ones invoking epoll methods on their respective FDs; however, when a new connection is received, we call removeIdleSockets (at most once every 5 seconds) to clear any stale ones. This is performed on a different thread, and I wonder if this EPERM error is related to two threads trying to invoke functions on the same epoll FD concurrently.
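To illustrate the kind of serialization this theory points at, here is a minimal sketch, assuming the cleanup can be funnelled through one serial DispatchQueue. It is not Kitura-net's actual code: `IdleSocketRegistry`, `touch`, `removeIdle` and the queue label are hypothetical names, and the real epoll_ctl call is only marked by a comment so the example runs on any platform.

```swift
import Foundation
import Dispatch

/// Hypothetical stand-in for a socket manager; not a Kitura-net type.
final class IdleSocketRegistry {
    // Serial queue: every access to `lastActive` runs here, one at a time.
    private let queue = DispatchQueue(label: "example.idle-socket-registry")
    private var lastActive: [Int32: Date] = [:]

    /// Record activity on a file descriptor.
    func touch(_ fd: Int32, at time: Date = Date()) {
        queue.sync { lastActive[fd] = time }
    }

    /// Remove descriptors idle for longer than `timeout`; returns the removed fds, sorted.
    func removeIdle(olderThan timeout: TimeInterval, now: Date = Date()) -> [Int32] {
        return queue.sync {
            let idle = lastActive
                .filter { now.timeIntervalSince($0.value) > timeout }
                .keys
                .sorted()
            for fd in idle {
                // A real server would issue epoll_ctl(EPOLL_CTL_DEL) here, so the
                // call never overlaps with another cleanup pass on the same state.
                lastActive.removeValue(forKey: fd)
            }
            return idle
        }
    }

    /// Number of descriptors currently tracked.
    var count: Int {
        return queue.sync { lastActive.count }
    }
}

// Usage: several threads request a cleanup at once; the serial queue runs them in turn.
let registry = IdleSocketRegistry()
let start = Date()
registry.touch(3, at: start)                          // will be 60 s idle at cleanup time
registry.touch(4, at: start.addingTimeInterval(50))   // will be 10 s idle at cleanup time
DispatchQueue.concurrentPerform(iterations: 4) { _ in
    _ = registry.removeIdle(olderThan: 30, now: start.addingTimeInterval(60))
}
print("sockets still tracked:", registry.count)
```

The sketch only shows concurrent cleanup requests being serialized; whether that addresses the EPERM seen here is still an open question.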
That theory makes sense to me!
I just got another crash that seems related. All I got from the logs is this:
Fatal error: Trying to remove task, but it's not in the registry.: file Foundation/URLSession/TaskRegistry.swift, line 76
This has only happened once, so it is pretty rare. I don't see anything unusual in the logs beforehand.
@bridger I'm also getting this issue, have you found a solution?
@mikezander have you moved to Swift 5 recently? We've had a few reports of this and it looks like it's a bug in URLSession on Linux. There is a prototype fix here that we are hoping to get into Swift 5.0.1: https://github.com/apple/swift-corelibs-foundation/pull/2061
@ianpartridge No, I actually haven't updated to Swift 5 yet. I'm still running Swift 4 on Kitura version 2.3.0. I was thinking I should update to 2.5.0; could that possibly fix the issue?
Hmm, I can't replicate it, but based on that bug it looks like the issue is Swift-related.
Interesting. All the reports we have had so far are on Swift 5. The problem is definitely in Foundation, not Kitura, so I'm afraid upgrading Kitura is unlikely to help (although we would recommend you do that anyway, as there are piles of improvements since version 2.3!).
Out of interest, are you running on Swift 4.0, 4.1 or 4.2? We are discussing how long to continue to support earlier versions of Swift, and user feedback would be very helpful.
As for your immediate problem, the only option I can suggest is to avoid using URLSession on Linux :( How are you using URLSession? Directly from your Kitura app or via a library like https://github.com/IBM-Swift/SwiftyRequest ? You might consider trying https://ibm-swift.github.io/Kitura-net/Classes/ClientRequest.html instead, which uses libcurl directly instead of URLSession.
Just to report the same issue:
[2019-12-09T02:31:10.976+01:00] [ERROR] [IncomingSocketManager.swift:295 removeIdleSockets(removeAll:runNow:)] epoll_ctl failure. Error code=1. Reason=Operation not permitted
Swift version 5.1 (swift-5.1.2-RELEASE)
Target: x86_64-unknown-linux-gnu
Kitura 2.8.0
This makes it totally unusable, as a lot of requests fail (even with just 100 requests at a concurrency of 10, so not exactly high load):
ab -n 100 -c 10 https://.../index
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking press.toys (be patient).....done
Server Software: Apache/2.4.41
Server Hostname:
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-CHACHA20-POLY1305,2048,256
Server Temp Key: ECDH X25519 253 bits
TLS Server Name:
Document Path: /index
Document Length: 14023 bytes

Concurrency Level: 10
Time taken for tests: 5.114 seconds
Complete requests: 100
Failed requests: 32
   (Connect: 0, Receive: 0, Length: 32, Exceptions: 0)
Non-2xx responses: 3
Total transferred: 1385997 bytes
HTML transferred: 1371341 bytes
Requests per second: 19.55 [#/sec] (mean)
Time per request: 511.410 [ms] (mean)
Time per request: 51.141 [ms] (mean, across all concurrent requests)
Transfer rate: 264.66 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       61   92   16.7     90    156
Processing:    40  192  504.7     80   3030
Waiting:       39  185  505.7     69   3030
Total:        110  285  502.7    173   3106

Percentage of the requests served within a certain time (ms)
  50%    173
  66%    188
  75%    223
  80%    255
  90%    339
  95%    389
  98%   3105
  99%   3106
 100%   3106 (longest request)
:(