libinotify-kqueue icon indicating copy to clipboard operation
libinotify-kqueue copied to clipboard

Fix hangs on FreeBSD

Open ghost opened this issue 9 years ago • 11 comments

We're trying to create an fsnotifier version for FreeBSD/OpenBSD's IntelliJ ports with libinotify but are running into trouble when trying to stress test it. It hangs in an sbwait state when trying to watch, then unwatch a lot of files. See https://github.com/idea4bsd/fsnotifier/issues/1.

I fixed it by adding the SOCK_NONBLOCK option to socketpair. I'm not sure if this a proper fix. Let me know if this is ok.

The test suite passes mostly. Only tests that also fail in https://github.com/dmatveev/libinotify-kqueue/issues/33 don't. So I guess it is ok they fail here too.

In test "Directory notifications":
    failed: receive IN_MOVED_FROM event on moving file from directory to another location within the same mount point
    failed: receive IN_MOVED_TO event on moving file to directory from another location within the same mount point

In test "Open/close notifications":
    failed: receive IN_OPEN on cat
    failed: receive IN_CLOSE_NOWRITE on cat
    failed: receive IN_OPEN on ls
    failed: receive IN_CLOSE_NOWRITE on ls
    failed: receive IN_OPEN on modify
    failed: receive IN_CLOSE_WRITE on modify

In test "Symbolic links":
    failed: Start watch successfully on a symlink file with IN_DONT_FOLLOW
    failed: Receive IN_ATTRIB after touching symlink itself
    failed: Receive IN_MOVE_SELF after moving the symlink
    failed: Receive IN_DELETE_SELF after removing the symlink

--------------------
     Run: 119
  Passed: 107
  Failed: 12

ghost avatar Feb 21 '16 01:02 ghost

Hi Tobias! Thanks for report.

How can I reproduce the problem on my side (I have NetBSD only)? You mentioned a large number of files, how large it should be?

Can you please specify sequence of inotify calls you do to reproduce it, so I could code up an example on my side and have a look on behavior?

dmatveev avatar Feb 21 '16 09:02 dmatveev

I'm unsure. I've ran fsnotifier's self-test over FreeBSD's ports tree which has ~ 300000 files, but it also hangs with trees with ~ 1000 files. Here is a minimal test program I came up with that has the same problem without the patch:

#include <sys/inotify.h>
#define _WITH_GETLINE
#include <stdio.h>
#include <string.h>
#include <err.h>

#define EVENT_MASK IN_MODIFY | IN_ATTRIB | IN_CREATE | IN_DELETE | IN_MOVE | IN_DELETE_SELF | IN_MOVE_SELF

#define FILES_TO_WATCH 1000

int main(int argc, char **argv) {
    char *line = NULL;
    size_t linecap = 0;
    ssize_t linelen;

    int wd[FILES_TO_WATCH];
    int fd = inotify_init();
    int n = 0;

    while(getline(&line, &linecap, stdin) > 0 && n < FILES_TO_WATCH) {
        line[strlen(line) - 1] = '\0';
        printf("Watching %i\n", n + 1);
        wd[n] = inotify_add_watch(fd, line, EVENT_MASK);
        if(wd[n] < 0)
            err(1, "inotify_add_watch");
        n++;
    }

    // It'll always block here at file 513
    for(int i = 0; i < n; i++) {
        printf("Unwatching %i/%i\n", i + 1, n);
        if(inotify_rm_watch(fd, wd[i]))
            err(1, "inotify_rm_watch");
    }

}

Compile with cc -o block block.c -linotify -I/usr/local/include -L/usr/local/lib and run it like find /usr/pkgsrc | ./block.

ghost avatar Feb 21 '16 10:02 ghost

Thank you so much!

dmatveev avatar Feb 21 '16 10:02 dmatveev

I fixed it by adding the SOCK_NONBLOCK option to socketpair. I'm not sure if this a proper fix. Let me know if this is ok.

You are going the right way but just adding NONBLOCK option to socket is not enough. Problem is that internal inotify event queue is flushed unconditionally after every worker`s round so socket buffer can be overflown and block worker thread. Right solution is to check available space with kqueue EVFILT_WRITE event and write limited amount of events to socket when this space exists. Also we should drop queue on overflow and insert IN_Q_OVERFLOW inotify event here in this case.

wulf7 avatar Feb 21 '16 23:02 wulf7

I wrote some code and put it in my git repo: https://github.com/wulf7/libinotify-kqueue but it need some testing before pull-request

wulf7 avatar Feb 23 '16 21:02 wulf7

I forcepushed slightly better variant to https://github.com/wulf7/libinotify-kqueue just now.

wulf7 avatar Feb 24 '16 22:02 wulf7

Thank you. I can confirm that your code works. fsnotifier's self-test completes now when run over the entire ports tree. Is there anything else I can test?

ghost avatar Feb 26 '16 13:02 ghost

Sorry guys for delay from my side, I am pretty overloaded at the moment.

Vladimir, great done as usual! So what I should merge?

dmatveev avatar Feb 26 '16 15:02 dmatveev

Tnx, guys for cooperation I`ve made a pull request https://github.com/dmatveev/libinotify-kqueue/pull/35

wulf7 avatar Feb 27 '16 12:02 wulf7

To Tobias. I have not looked in fsnotifier code but if you want to remove all watches from inotify instance, the simplest (and in many cases the best) way to achieve that is just a reopening of instance

close(fd);
fd = inotify_init();

wulf7 avatar Feb 29 '16 21:02 wulf7

Thanks. AFAICT removing all watches really only happens during the self-test and not during normal operation. I don't think this is an option but I'll have to look into fsnotifier's code some more.

ghost avatar Feb 29 '16 22:02 ghost