
Possible file descriptor leak

Open xorgy opened this issue 11 years ago • 14 comments

I've been running httperf tests on a test page rendering one diet template with no parameters, and have come to an issue whereby vibe will first run out of socket FDs, complain for a while with Error accepting an incoming connection: 24 (but still serve requests once it settles down), then after enough of that, simply fail to serve requests.

I think there may be a file descriptor leak somewhere, but I haven't had an opportunity to confirm that as of yet, I will update when I can.

xorgy avatar Jul 06 '13 00:07 xorgy

If it is a performance test, your host OS is quite likely to run out of file descriptors / sockets; that is not a bug. Please check that your ulimit -n is larger than the number of simultaneous connections, and otherwise provide both numbers.

mihails-strasuns avatar Jul 06 '13 08:07 mihails-strasuns

If you run tests with large numbers of concurrent connections, it's easy to reach the user-defined FD limit; ulimit -n only reports 1024 by default on my Ubuntu box, for example. I've never had an FD leak, either on the live site or in benchmark tests (usually with 1M requests), so this sounds very much like what is happening here.

s-ludwig avatar Jul 06 '13 08:07 s-ludwig
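
For reference, a minimal, untested sketch of checking that limit from inside a D process via getrlimit from druntime's core.sys.posix.sys.resource, as an alternative to running ulimit -n:

import core.sys.posix.sys.resource : getrlimit, rlimit, RLIMIT_NOFILE;
import std.stdio : writefln;

void main()
{
    rlimit lim;
    // RLIMIT_NOFILE is the per-process limit on open file descriptors.
    if (getrlimit(RLIMIT_NOFILE, &lim) == 0)
        writefln("open-file limit: soft=%s hard=%s", lim.rlim_cur, lim.rlim_max);
}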

8 hours and then at the same second... not bad ;)

s-ludwig avatar Jul 06 '13 08:07 s-ludwig

Good morning sir :D

mihails-strasuns avatar Jul 06 '13 08:07 mihails-strasuns

It runs out of file descriptors permanently, even after the connections close. I'm well aware that hitting limits is natural, but hitting them and never recovering is an issue.

I push fairly simple requests at it at a connection rate of about 1100/s, and it reports errno 24 through the error printer on line 465 of libevent2_tcp.d, as expected. But if I keep this up long enough, it can no longer serve connections at all: even after it cools down for a few minutes and stops printing error messages, it is still blocked by more errno 24 (which hints at something like an FD leak).

This doesn't work with a local tester, since the tester tries to allocate a socket and fails before the server gets to open one to serve the request.

xorgy avatar Jul 06 '13 15:07 xorgy

Even though the connections are mostly served in under a millisecond, and at a per-second rate below the user limit (concurrent connections not even hitting the ulimit), it eats up file descriptors permanently.

xorgy avatar Jul 06 '13 16:07 xorgy

I'm seeing this issue occur on a project of mine also. The error message I'm receiving is: vibe.crypto.cryptorand.CryptoException@../../.dub/packages/vibe-d-0.7.23/source/vibe/crypto/cryptorand.d(91): Cannot init SystemRNG: Error id is 24, Error message is: "Cannot open file '/dev/urandom' in mode 'rb' (Too many open files)"

I don't know if this is a problem with vibe's SystemRNG class or something further underneath it.

Anyone have any ideas?

grogancolin avatar Jul 29 '15 12:07 grogancolin

@grogancolin, does that occur at application startup, or after the application has been running for a while? I'm asking because under normal circumstances the SystemRNG is only instantiated once per thread, which usually happens just once at startup.

s-ludwig avatar Jul 29 '15 12:07 s-ludwig

@s-ludwig It only occurs after the application has been running for a while. I'm currently trying to reproduce the problem while tracking the number of open file descriptors to see whether it's increasing.

I'm using lsof -p <PID of vibe.d process> to keep track of the file descriptors. Hopefully I'll have further useful information soon.

grogancolin avatar Jul 29 '15 12:07 grogancolin

@s-ludwig So, in our project, we were using SystemRNG to generate random keys. Our code created a new SystemRNG each time, and that leaked the file descriptors.

I added a destructor (~this() { file.close(); }) to my local vibe-d code, however the GC never seems to collect that class for some reason. Unsure as to why.

While the underlying problem should be fixed (I'm not sure how; I scanned the code, but nothing other than the missing destructor jumped out at me), it's easy to work around on my end: simply reuse the SystemRNG object!

grogancolin avatar Jul 29 '15 19:07 grogancolin
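
A rough sketch of that workaround (one SystemRNG reused per thread instead of a new instance per key); the read() call is assumed from vibe's RandomNumberStream interface, and the snippet is untested:

import vibe.crypto.cryptorand : SystemRNG;

// Module-level variables are thread-local in D, so /dev/urandom is
// opened once per thread rather than once per generated key.
private SystemRNG g_rng;

ubyte[] randomKey(size_t length)
{
    if (g_rng is null)
        g_rng = new SystemRNG;
    auto key = new ubyte[length];
    g_rng.read(key);   // fill the buffer with random bytes
    return key;
}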

I narrowed down one method of getting a file descriptor leak: use requestHTTP on at least 17 different URLs on different domains. Test case attached.

With 15 or fewer domains, there is no leak.

With 17 domains, it leaks a little less than one file descriptor per request.

Tested on Ubuntu 16.04, Vibe 0.8.1, dmd v2.076.0, x64.

vibe_fd_leak_test_case.txt

dhasenan avatar Nov 24 '17 02:11 dhasenan

http/vibe/http/client.d:216

static vibe.utils.array.FixedRingBuffer!(Tuple!(ConnInfo, ConnectionPool!HTTPClient), 16) s_connections;

Looks suspicious: the ring buffer only keeps 16 connection pools, which lines up with the leak appearing once a 17th distinct domain is used.

Using HTTPClient directly avoids this problem.

dhasenan avatar Nov 24 '17 03:11 dhasenan

Correction: using HTTPClient directly fixes the problem for that constrained case. In my actual code, I also had to call HTTPClient.disconnect. Calling HTTPClientResponse.disconnect with requestHTTP didn't help.

dhasenan avatar Nov 24 '17 04:11 dhasenan
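
For reference, a rough, untested sketch of that workaround: driving HTTPClient directly and disconnecting explicitly, instead of going through requestHTTP's 16-slot connection pool. The connect/request/disconnect calls are assumed from the vibe.d 0.8.x API, the URL handling is simplified, and the function is meant to run inside a vibe.d task:

import vibe.http.client : HTTPClient, HTTPClientRequest, HTTPClientResponse;
import vibe.http.common : HTTPMethod;
import vibe.inet.url : URL;
import vibe.stream.operations : readAllUTF8;

// Fetch one URL with a dedicated client and close its socket explicitly.
string fetch(string urlString)
{
    auto url = URL(urlString);
    auto client = new HTTPClient;
    scope (exit) client.disconnect();   // make sure the FD is released

    immutable ushort port =
        cast(ushort)(url.port ? url.port : (url.schema == "https" ? 443 : 80));
    client.connect(url.host, port, url.schema == "https");

    string result;
    client.request(
        (scope HTTPClientRequest req) {
            req.requestURL = url.localURI;
            req.method = HTTPMethod.GET;
        },
        (scope HTTPClientResponse res) {
            result = res.bodyReader.readAllUTF8();
        });
    return result;
}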

I used the same method as https://github.com/vibe-d/vibe.d/issues/1462 to test (adding a writeln of /proc/self/fd). The test case is @dhasenan's, with the writeln and imports added:

/+
  dub.json:
  {
   "name": "test",
   "dependencies": {
    "vibe-d": "*"
   }
}
+/
import std.stdio;
import std.file;
import std.range;
import vibe.d;

void main()
{
    auto urls = [
        "http://www.liberafolio.org/feed/",
        "http://what-if.xkcd.com/feed.atom",
        "http://www.pusheen.com/rss",
        "http://www.skindeepcomic.com/feed/",
        "http://wattchewant.tumblr.com/tagged/nsfw/rss",
        "http://lesswrong.com/.rss",
        "https://www.youtube.com/feeds/videos.xml?channel_id=UCOGeU-1Fig3rrDjhm9Zs_wg",
        "http://www.miserez.org/ws/xkcdmobile/rss.xml",
        "http://fullten.tumblr.com/rss",
        "http://grrlpowercomic.com/feed",
        "http://procedural-generation.tumblr.com/rss",
        "http://blog.duangle.com/feeds/posts/default?alt=rss",
        "http://trextrying.tumblr.com/rss",
        "http://blog.franchesca.net/rss",
        "http://thepunchlineismachismo.com/feed",
        "http://feeds.feedburner.com/satwcomic",
        "http://monstergirlisland.com/rss",
    ];
    writefln("%s urls", urls.length);
    size_t i = 0;
    while (true)
    {
        foreach (url; urls)
        {
            requestHTTP(url,
                (scope HTTPClientRequest req) { req.method = HTTPMethod.GET; },
                (scope HTTPClientResponse resp) {
                    try
                    {
                        resp.bodyReader.readAllUTF8;
                    }
                    catch (Exception e)
                    {
                    }
                });
            i++;
            writeln("Number of file descriptors: ", dirEntries("/proc/self/fd/", SpanMode.breadth, false).walkLength);
            if (i % 100 == 0) writefln("finished %s requests", i);
        }
    }
}

If resp.bodyReader.readAllUTF8; is commented out, there is no leak: we stay at a cool 7 FDs. However, with the above code, we get:

Number of file descriptors: 97
Number of file descriptors: 98
Number of file descriptors: 99
Number of file descriptors: 100
Number of file descriptors: 101
finished 100 requests
Number of file descriptors: 102
Number of file descriptors: 103
Number of file descriptors: 104
Number of file descriptors: 104
Number of file descriptors: 105

So the issue still exists.

Geod24 avatar Jun 23 '21 07:06 Geod24