vibe.d icon indicating copy to clipboard operation
vibe.d copied to clipboard

leaking eventcore driver because there are still active handles

Open mengu opened this issue 6 years ago • 46 comments

hello everyone

i need to build a websocket server that needs to talk to cups. i've got that part sorted out. i've tried to give the websocket example a try.

vibe.d version 0.8.4: compiles well however when i quit the server, i got this message: leaking eventcore driver because there are still active handles vibe.d version 0.8.5.beta: does not even run, it exits with the same error.

os: macOS high sierra (10.13.6) kernel: Darwin Kernel Version 17.7.0: Thu Jun 21 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64 dmd: DMD64 D Compiler v2.083.1 ldc: LDC - the LLVM D compiler (1.7.0):

here's the sample code:

import std.stdio;
import std.string;
import std.conv;
import core.time;

import vibe.core.core : sleep;
import vibe.core.log;
import vibe.http.router : URLRouter;
import vibe.http.server;
import vibe.http.websockets : WebSocket, handleWebSockets;

void handleWebSocketConnection(scope WebSocket socket) {
  int counter = 0;
  logInfo("Got new web socket connection.");
  while (true) {
    sleep(1.seconds);
    if (!socket.connected) break;
    counter++;
    logInfo("Sending '%s'.", counter);
    socket.send(counter.to!string);
  }
  logInfo("Client disconnected.");
}

shared static this() {
  auto router = new URLRouter;
  router.get("/ws", handleWebSockets(&handleWebSocketConnection));

  auto settings = new HTTPServerSettings;
  settings.port = 8080;
  settings.bindAddresses = ["::1", "127.0.0.1"];
  listenHTTP(settings, router);
}

mengu avatar Dec 26 '18 19:12 mengu

Looks like you just forgot about explicitly closing any open websockets when the server process is being terminated, hence 'there are still active handles'. For a simple test like this, that may probably be ignored.

thaven avatar Dec 27 '18 15:12 thaven

this actually happens even without any connections @thaven. also, with 0.8.5 beta the app does not even start.

mengu avatar Dec 27 '18 19:12 mengu

This is reproducible on my system (and in my project). Further, it is reproducible with the app_skeleton example (even terminating almost immediately after launch) (I haven't tried any other ones, but I would suspect that applies there as well). In app_skeleton, I'm seeing the message twice, verbatim:

Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 6 (streamListen)
Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 6 (streamListen)

I see this behavior in the following environments: a) os: Arch Linux amd64 kernel: Linux LapToast 4.20.12-arch1-1-ARCH #1 SMP PREEMPT Sat Feb 23 15:11:34 UTC 2019 x86_64 GNU/Linux dmd: DMD64 D Compiler v2.084.1 LDC - the LLVM D compiler (1.14.0): b) os: Ubuntu Cosmic (as LXD guest under Alpine Edge) kernel: Linux william 4.19.25-0-vanilla #1-Alpine SMP Mon Feb 25 08:42:23 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux dmd: DMD64 D Compiler v2.084.1 ldc: LDC - the LLVM D compiler (1.11.0):

EDIT: This is particularly annoying because the trailing sockets (which apparently do exist) prevent whatever application is/was running from restarting (it fails to bind). fuser -k is no help, and one seemingly needs to wait for the connection to drop. Perhaps closing the eventloop should forcefully terminate any current connections?

CosmicToast avatar Feb 27 '19 05:02 CosmicToast

Yep, seeing it with:

module app;

import vibe.http.router;
import vibe.web.rest;

shared static this ()
{
    auto router = new URLRouter();
    auto settings = new HTTPServerSettings("0.0.0.0");
    settings.port = 4245;
    router.registerRestInterface(new Server1());
    listenHTTP(settings, router);
}

@safe:
interface API1 { string getGreeting(); }
class Server1 : API1 { string getGreeting() { return "API1";  } }

As well...

Geod24 avatar Mar 11 '19 16:03 Geod24

I'm getting that error message with Redis as well on vibe 0.8.4

dedupely avatar Mar 20 '19 18:03 dedupely

Okay, so this is because runApplication(); is missing. A call to runApplication(); is required immediately after listenHTTP. I can reproduce this behaviour without runApplication():

import vibe.d;

class CheckTODO {
    @path("/health")
    void health(HTTPServerRequest req, HTTPServerResponse res) {
        res.writeBody("ok");
    }
}

void main() {
    auto router = new URLRouter;
    router.registerWebInterface(new CheckTODO);

    auto settings = new HTTPServerSettings;
    settings.port = 8080;
    listenHTTP(settings, router);
    // runApplication();
}
Failed to listen on 0.0.0.0:8080
Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 9 (streamListen)
Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 9 (streamListen)

However, if I uncomment runApplication, it works just fine.

judepereira avatar Jun 24 '19 19:06 judepereira

I'm experiencing this with vibe-d's provided entrypoint, which should just run runApplication (https://github.com/vibe-d/vibe.d/blob/master/core/vibe/appmain.d), so there may be more to it than that.

CosmicToast avatar Jun 24 '19 19:06 CosmicToast

Ah, I missed that, sorry.

judepereira avatar Jun 24 '19 19:06 judepereira

You have to start the eventloop either way. This is the server not cleaning up before exit from what I understand. Maybe it has something to do with a lingering route. Honestly not sure. Do you guys know if this results in a non 0 exit code? If so it's bad for things like docker container orcestration, which I use.

On Mon, Jun 24, 2019, 2:21 PM Jude Pereira [email protected] wrote:

Ah, I missed that, sorry.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/vibe-d/vibe.d/issues/2245?email_source=notifications&email_token=AHF3S727PEIIDBRWK53YQ4DP4ENFDA5CNFSM4GMIDZ52YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYN6LHA#issuecomment-505144732, or mute the thread https://github.com/notifications/unsubscribe-auth/AHF3S74FKFXFSCX4RWHZJELP4ENFDANCNFSM4GMIDZ5Q .

dedupely avatar Jun 26 '19 02:06 dedupely

Exit code is still 0 in my experience. However, it results in the port not being bindable for a short period of time afterwards, which is very annoying for server restarts and/or development.

CosmicToast avatar Jun 26 '19 03:06 CosmicToast

In that case I'm not so worried. The only thing I'm worried (a little) about it that routes are still processing jobs at exit. Makes 0 downtime deploys not entirely possible. However, from the looks of things, the fds were not cleaned up properly, that could be all. I had the same error message with vibed-redis and had to switch to tinyredis to solve it.

On Tue, Jun 25, 2019 at 10:15 PM Chloe Kudryavtsev [email protected] wrote:

Exit code is still 0 in my experience.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/vibe-d/vibe.d/issues/2245?email_source=notifications&email_token=AHF3S77C5ZNHD6G6EJVGI7TP4LNLNA5CNFSM4GMIDZ52YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYSGBGY#issuecomment-505700507, or mute the thread https://github.com/notifications/unsubscribe-auth/AHF3S73ZPBNLFPDITAYR7OTP4LNLNANCNFSM4GMIDZ5Q .

-- Clinton Skakun Founder and CEO Dedupely Software, Inc. | Dedupe.ly https://dedupe.ly/

HQ: 11871 Horseshoe Way #1103, Richmond, BC V7A 5H5, Canada

Latam: 333c Interior Calle 12 x 1A y calle cerrada, Col. Vista Alegre Norte 97130 Mérida, Yucatán

dedupely avatar Jun 26 '19 14:06 dedupely

I'm still having issues with this and strangely it appears that the port is not being let go of (not sure if this makes sense). Because on restart it crashes to due to the port being in use. Maybe 'leaking eventcore' has something to do with the http listener not closing?

dedupely avatar Jul 08 '19 19:07 dedupely

The port reuse issue will be fixed with vibe-core 1.7.0 (already available in the latest beta), the default for SO_REUSEADDR accidentally changed to 0 in a previous release.

A generic, built-in way to wait for all incoming HTTP requests to complete is still missing, but listenHTTP returns a HTTPListener that has a stopListening method to close the listening socket.

s-ludwig avatar Jul 09 '19 09:07 s-ludwig

The port reuse issue will be fixed with vibe-core 1.7.0 (already available in the latest beta), the default for SO_REUSEADDR accidentally changed to 0 in a previous release.

A generic, built-in way to wait for all incoming HTTP requests to complete is still missing, but listenHTTP returns a HTTPListener that has a stopListening method to close the listening socket.

Thanks this is what I needed to hear. It will be an easy fix once we have 1.7.0 in production. Can I use HTTPServerOption.reusePort in the meantime? You mentioned the default but is this configurable?

dedupely avatar Jul 09 '19 15:07 dedupely

Hmm, this actually appears to have fixed the issue:

I updated to 1.7.0 beta2 (probably does not change much in this leaking case).

Because my app uses it's own main and I call runEventLoop(); to start the even loop, it appeared something else was not being cleaned up.

As per @s-ludwig 's instruction, I did this:

  auto l = listenHTTP(settings, router);

  runEventLoop();

  l.stopListening;

This was what I got before:

main thread exit
Event loop exit reason (exit flag=false): exited
Event loop done (scheduled tasks=0, waiters=1, thread exit=false).
Thread exit main (index 0) (main=true)
Main thread exiting
Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 9 (streamListen)
Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 9 (streamListen)

This is what I get now:

main thread exit
Event loop exit reason (exit flag=false): exited
Event loop done (scheduled tasks=0, waiters=1, thread exit=false).
[main(----) INF] Stopped to listen for HTTP requests on :::8089
Thread exit main (index 0) (main=true)
Main thread exiting

So it looks like that was the problem. Event loop exit doesn't seem to close the listening socket created in listenHTTP.

dedupely avatar Aug 01 '19 20:08 dedupely

I'm having a similar issue. I've just now tried to make sure all sockets are closed so I don't get this error message. I believe I'm handling all the database connections, and I'm on 1.7.0 of vibe-core. When I look at the trace output, the sockets that are left are the streamSockets used by the web server to handle incoming requests. Looking at the code, it appears they should be closed, but there's no trace to say a socket is closed.

schveiguy avatar Sep 20 '19 21:09 schveiguy

Just to clarify, my messages say streamSocket, NOT streamListen, and I added the stopListening call, it doesn't fix it.

schveiguy avatar Sep 20 '19 21:09 schveiguy

I still get this as well, but I'm using a number of different services like phobos tcp sockets, postgres and beanstalkd. Any one of these could be contributing. However, I've noticed I get it less running the stopListening at the end. Honestly I don't know enough about Vibe.d under the hood to really make educated guesses on this.

dedupely avatar Sep 23 '19 14:09 dedupely

The underlying issue of WHY this happens is that the event driver is cleaning up synchronously, but items that will be cleaned up by the GC are still holding resources. Since the GC is the last thing that cleans up, the event driver has to leak its resources to avoid segfaults. Before these messages you'd get a segfault every close. But I would expect the core vibe-d fibers to properly release sockets when they are done.

schveiguy avatar Sep 23 '19 19:09 schveiguy

More info: it's the redis session store that is not closing. I'm "cheating" by using tupleof so I can close the driver sockets, but they still leak (I'm using instrumented code that prints out when sockets are closed). Will continue to investigate, I'm hoping if I fix this problem, the messages go away. Note that if I close the app while there are still keepalive sockets open, those will leak. I'd like to see those go away too at shutdown, or at least have a way to wait for them to close. But if you do wait for the timeout, they don't leak.

schveiguy avatar Sep 24 '19 20:09 schveiguy

OK, FINALLY! I figured out why this is happening.

The redis session store doesn't provide a way to release unused connections, so as I said, I was cheating and using tupleof to get the ConnectionPool out of the RedisClient.

Then I was calling removeUnused to clear out all the connections still open. Each RedisConnection is a class, so there is no problem reference-wise there BUT, it only provides property access to the TCPConnection via conn(). I was calling close on that, thinking it would release the resource, but there was still a reference to the socket inside the class itself! THAT wasn't being cleaned up until the GC ran, which is why the driver was leaking.

The solution here is to expose the removeUnused functionality so the session store can be properly cleaned up before shutdown.

schveiguy avatar Sep 25 '19 20:09 schveiguy

This is redis-specific though. So the simple programs shown above exhibit a different issue ?

Geod24 avatar Sep 26 '19 01:09 Geod24

So the examples above seem to come from not closing down the listening socket (which it seems like folks have already fixed). But in general, ANY class-based thing that holds onto a descriptor may leak the driver. The only way to fix is to provide cleanup mechanisms that can be called on program termination, and to have them called. Aside from the redis issue, I need to also update mysql-native so it has a way to do this. Because it seems most of these wrappers hide a private ConnectionPool, it makes things difficult to call the cleanup function directly, and honestly, I'd rather have it implemented correctly in one place.

What WOULD be nice is if the call leaking the driver would identify the way that descriptor was used. For example, if it's a file, say "last used to open /x/y/z" or if it's a socket, "last used as a TCP connection to X.X.X.X:Y", to help narrow down where it's leaking from. As with most leaks, it's easy to tell something has leaked, but really difficult to tell where it leaked from. Having those clues can save hours of work (as it did take me hours to instrument everything to find the right place it was opened). Maybe an eventcore option to store the appropriate data for describing the source of the leak.

I also just realized, the messages are all being printed twice. Not sure why that is.

schveiguy avatar Sep 26 '19 15:09 schveiguy

Added a PR for fixing the redis issue.

schveiguy avatar Sep 27 '19 13:09 schveiguy

I just experienced this with the "Hello World!" example. This time I'm playing with Vibe.d on FreeBSD (12.1). I'm pretty sure I ran into it many months ago as well, that time when playing with Vibe.d on Gentoo Linux. That was likely also in the "Hello World!" example.

To reproduce the problem, in brief:

  • Make a new project using dub init <project-name> -t vibe.d as described at the top of the documentation page.
  • cd path/to/project
  • Optionally, add "versions": [ "VibeHighEventPriority" ] to dub.json, or versions "VibeHighEventPriority" to dub.sdl, as described in vibe-d/vibe-core issue #205.
    • The bug described here still happens without doing this, but it'll save you a few minutes of futile hammering-of-the-ctrl-c before pulling out a bigger hammer and running kill -9 $vibe_testing_pid, and thus you'll get cleaner output that stands a better chance of matching what I quote below.
  • Build the thing: dub build -b debug
  • Run it, then hit Ctrl-C:
$ ./vibe-testing 
[main(----) INF] Listening for requests on http://[::1]:8080/
[main(----) INF] Listening for requests on http://127.0.0.1:8080/
[main(----) INF] Please open http://127.0.0.1:8080/ in your browser.
^C[main(----) INF] Received signal 2. Shutting down.
Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 8 (streamListen)
   FD 9 (streamListen)
Warning (thread: main): leaking eventcore driver because there are still active handles
   FD 8 (streamListen)
   FD 9 (streamListen)

Thankfully, on this system and with this config (FreeBSD 12.1; w/ VibeHighEventPriority if it matters) the bug/warning doesn't prevent subsequent runs of the test executable. I can ctrl-C and rerun immediately over and over again to my heart's content. It can even restart effectively instantly after I visit the "Hello World!" webpage. In my case, It just looks disturbing, because of the warning message.

The source/app.d contents I had at the start, with bug present as described above, looked like this:

import vibe.vibe;

void main()
{
        auto settings = new HTTPServerSettings;
        settings.port = 8080;
        settings.bindAddresses = ["::1", "127.0.0.1"];
        listenHTTP(settings, &hello);

        logInfo("Please open http://127.0.0.1:8080/ in your browser.");
        runApplication();
}

void hello(HTTPServerRequest req, HTTPServerResponse res)
{
        res.writeBody("Hello, World!");
}

I made it go away by modifying it according to @s-ludwig 's suggestion and @dedupely 's example earlier in this thread:

import vibe.vibe;

void main()
{
        auto settings = new HTTPServerSettings;
        settings.port = 8080;
        settings.bindAddresses = ["::1", "127.0.0.1"];
        auto listener = listenHTTP(settings, &hello);
        scope(exit)
                listener.stopListening();

        logInfo("Please open http://127.0.0.1:8080/ in your browser.");
        runApplication();
}

void hello(HTTPServerRequest req, HTTPServerResponse res)
{
        res.writeBody("Hello, World!");
}

Note the auto listener = listenHTTP(settings, &hello); and scope(exit) listener.stopListening();.

With that, the output becomes:

$ ./vibe-testing 
[main(----) INF] Listening for requests on http://[::1]:8080/
[main(----) INF] Listening for requests on http://127.0.0.1:8080/
[main(----) INF] Please open http://127.0.0.1:8080/ in your browser.
^C[main(----) INF] Received signal 2. Shutting down.
[main(----) INF] Stopped to listen for HTTP requests on ::1:8080
[main(----) INF] Stopped to listen for HTTP requests on 127.0.0.1:8080

Yay! No more leaking eventcore drivers.

I suspect that this particular manifestation can be fixed with documentation and example changes. The idiomatic code I see in the example above and on the documentation webpage seems to follow the same pattern (omitting the listener assignment statement and later .stopListening() call), and this pattern is requisite for this problem to happen. I suggest using something like the modified example above as the default example. Then it will not only copy, paste, and run without warning messages (and possible blocked sockets), but it shows learners that the object returned from listenHTTP holds network resources (it's all in the "stopListening" name) as well as how to release those resources.

I'd try to provide a PR, but I already looked for the source of that app.d "Hello World!" example and couldn't find it quickly. I hope it at least helps to know that this can happen in basic examples.

Thank you all.

chadjoan avatar Jun 06 '20 15:06 chadjoan

The underlying issue here is that one does not get detailed information about what handles are leaking, and when they were opened. It would be huge to be able to identify what the handles were doing. FD 8 and FD 9 is useless information. streamListen is useful, but not entirely descriptive. I'd like to see at a minimum all the information that vibe-core knows about it (i.e. if it's a socket, what type (TCP/UDP), IP address/port it's connected to, if possible when it was opened).

It would be even better to get a stack trace from the place the last time the socket was used (enabled only with a certain flag of course). I think this should be doable.

The root cause of all these is -- you didn't release a resource. But finding out which resource and why it was opened is very difficult from the messages given.

I also agree the default vibe-d server should clean up itself properly.

schveiguy avatar Jun 06 '20 15:06 schveiguy

I... totally agree and also really like good error messages.

I hope I didn't distract from that. I just wanted to provide an easy (I hope) solution for an adoption pitfall that arises as one consequence of the resource management and error reporting issues.

By that logic, it seems that this could probably spawn a few separate issues, in addition to the redis one that you handled: fix idioms in documentation, more specific+useful error messages, error messages are printed twice, less error-prone resource management (scope or ref-count non-memory resources by default?), and maybe more. I worry that I would do a poor job of writing those because I lack specifics and am not very familiar with vibe.d (yet).

chadjoan avatar Jun 06 '20 16:06 chadjoan

In my testing, scope(exit)... sometimes, but not always, fixes the problem. In some cases, after ctl-c the other thread cleans itself up and closes. In other cases, it does not.

jblachly avatar Aug 12 '20 19:08 jblachly

Hi all. thanks for the helpful suggestions.

Full disclosure, I'm not that good at coding. intermediate at best.

What i did notice on my setup though (Catalina, fresh D install, new vibe-d template project) is that NO MATTER WHAT I DO - scope(exit) or otherwise - i can only rerun the vibe-d application after refreshing my web browser one time.

Steps to reproduce:

  1. Launch vibe-d app
  2. Ctrl-C
  3. Attempt to rerun multiple times - unsuccessfully.
  4. Switch to - or open new - web browser.
  5. Navigate to or refresh localhost:8080 - App appears JUST ONCE in browser even AFTER Ctrl-C (you read correctly)
  6. After the app appears that ONE TIME in the browser, vibe-d app successfully restarts

With scope(exit) listener.stopListening() I get the "Stopped listening" log ONLY after ONE browser refresh following Ctrl-C.

Without scope(exit) i do not receive that log message at all.

I don't know much about whats going on architecturally but it seems that a remnant of the app is sitting somewhere after Ctrl-C is pressed and remains until at least one request is made for/of that remnant.

I'm not sure if any of this is helpful as I currently only speak noob, but it reproduces every time...

kinexis-uk avatar Nov 15 '20 00:11 kinexis-uk

I also came across this problem and wrote about it on DLang forums [1]. This is the toy example [2] I've been working on to test.

[1] https://forum.dlang.org/post/[email protected] [2] https://github.com/SelimOzel/vibe_noLeaks

SelimOzel avatar Jan 02 '21 18:01 SelimOzel