notify Renaming on OS X yields raw-events that confuse debouncer

This is spawned off of https://github.com/rust-analyzer/rust-analyzer/issues/734#issuecomment-463403091

On Mac OS X, I am observing that when I rename a file from "foo" to "bar" within a (recursively) watched directory, the raw-events I get from raw watcher seem to vary wildly, which then causes the debounced events to appear to behave in a strange and unpredictable manner.

I have included a main.rs at the bottom (gist) that demonstrates the problem; it is meant to be plugged into a cargo crate that has notify = "4.0.9" as a dependency.

But I suspect more interesting than the code itself is the output it produces. I have seen essentially three output sequences from this code.

The full output from 11 runs is included at the same gist) as above. the full output also contains the events prior to the rename, which may be of interest for people investigating this, but I did not include them here because I want to make it clear what the problem with the Debounced Events are.)

The most egregious thing that I noticed is that it seems like the RawEvent will sometimes include the RENAME flag for "foo", but the resulting debounced event doesn't mention "foo" at all.

The output classes

Remove bar (!!!), no mention of foo

Occurred 9 out of 11 times

Renaming file: tmp/foo to tmp/bar
checking tmp/foo is gone
checking tmp/bar is present
checking tmp/bar has expected content
raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo"), op: Ok(CREATE | RENAME | WRITE), cookie: Some(4124192) }
raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar"), op: Ok(REMOVE | RENAME), cookie: Some(4124192) }
debounced event: NoticeRemove("/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar")
debounced event: Remove("/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar")

Create bar, no mention of foo

Occurred 1 out of 11 times

Renaming file: tmp/foo to tmp/bar
checking tmp/foo is gone
checking tmp/bar is present
checking tmp/bar has expected content
raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo"), op: Ok(CREATE | RENAME | WRITE), cookie: Some(4125655) }
raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar"), op: Ok(RENAME), cookie: Some(4125655) }
debounced event: Create("/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar")

Rename foo to bar (what I expected expected)

Occurred 1 out of 11 times

Renaming file: tmp/foo to tmp/bar
checking tmp/foo is gone
checking tmp/bar is present
checking tmp/bar has expected content
raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo"), op: Ok(RENAME), cookie: Some(4124247) }
debounced event: NoticeRemove("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo")
raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar"), op: Ok(RENAME), cookie: Some(4124247) }
debounced event: Rename("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo", "/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar")

The code

extern crate notify;

use notify::{Watcher as _};
use std::fs;
#[allow(unused_imports)]
use std::io::{self, Read as _};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

const WATCHER_DELAY: Duration = Duration::from_millis(100);

fn main() -> io::Result<()> {
    let root = "tmp";
    let from = &format!("{}/foo", root);
    let to = &format!("{}/bar", root);

    thread::sleep(Duration::from_secs(1));
    if let Some(metadata) = fs::metadata(root).ok() {
        assert!(metadata.is_dir());
    } else {
        println!("Creating directory: {}", root);
        fs::create_dir(root).ok();
    }

    match fs::metadata(from) {
        Err(ref err) if err.kind() == io::ErrorKind::NotFound => { }
        _ => panic!("remove {} before running test", from),
    }
    match fs::metadata(to) {
        Err(ref err) if err.kind() == io::ErrorKind::NotFound => { }
        _ => panic!("remove {} before running test", to),
    }

    let (raw_sender, raw_receiver) = mpsc::channel();
    let mut raw_watcher = notify::raw_watcher(raw_sender).unwrap();
    let (db_sender, db_receiver) = mpsc::channel();
    let mut db_watcher = notify::watcher(db_sender, WATCHER_DELAY).unwrap();
    raw_watcher.watch(root, notify::RecursiveMode::Recursive).unwrap();
    db_watcher.watch(root, notify::RecursiveMode::Recursive).unwrap();
    let _handle = thread::spawn(|| for event in raw_receiver {
        println!("raw event: {:?}", &event);
    });
    let _handle = thread::spawn(|| for event in db_receiver {
        println!("debounced event: {:?}", &event);
    });

    println!("Writing file: {}", from);
    fs::write(from, "Hello")?;
    thread::sleep(Duration::from_secs(1));

    println!("Renaming file: {} to {}", from, to);
    fs::rename(from, to)?;

    // Double-check everything happened as expected

    {
        println!("checking {} is gone", from);
        match fs::metadata(from) {
            Err(ref err) if err.kind() == io::ErrorKind::NotFound => { }
            _ => panic!("found {} even after renaming it?", from),
        }
    }

    {
        println!("checking {} is present", to);
        let _contents = {
            let mut c = String::new();
            fs::File::open(to)?.read_to_string(&mut c)?;
            c
        };
        {
            println!("checking {} has expected content", to);
            assert_eq!(_contents, "Hello");
        }
    }

    thread::sleep(Duration::from_secs(1));

    Ok(())
}

Feb 14 '19 12:02 pnkfelix

Hmm, skimming over the notify test suite, it seems that the tests there wait 35 seconds (!) after constructing the directory hierarchy for a rename test before building the watcher for the test.

Perhaps that is related to the problems I am observing here.

Feb 14 '19 13:02 pnkfelix

Yes, changing the thread::sleep(Duration::from_secs(1)); immediately after writing "Hello" to foo to sleep for 35s instead of 1s does seem to normalize the RawEvents so that we consistently see:

raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo"), op: Ok(RENAME), cookie: Some(4238156) }
debounced event: NoticeRemove("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo")
raw event: RawEvent { path: Some("/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar"), op: Ok(RENAME), cookie: Some(4238156) }
debounced event: Rename("/Users/fklock/Dev/Rust/ra-issue-734/tmp/foo", "/Users/fklock/Dev/Rust/ra-issue-734/tmp/bar")

Feb 14 '19 13:02 pnkfelix

I'm on my Phone right now and I didn't look at the code in a long time, but don't forget that fsevents has a buffer of circa 30s to merge events for a file.

Feb 14 '19 13:02 dfaust

Ah thank you for that information, I did not know about that constraint (that fsevents can merge events within a 30s buffer). This explains the sleep_macos(35_000) calls I see peppered in the notify tests.

Before you posted, I was musing that one way the notify crate could improve the user experience here would be to try to clean up the event when its "obvious" that its dealing with the source and target of a rename. E.g. for any instance of fs::rename(old, new) that we "identify" by seeing two rename events whose id's differ by 1 unit, notify::fsevents could unconditionally remove any CREATE from the old and REMOVE from the new.

But I am now thinking that throwing away information like that would be a mistake.

Still, I believe it would be better for the debouncer to somehow communicate that there was an event involving old here. Maybe by adding a DebouncedEvent::NoticeRename that it could emit for the old? (i.e. in the examples above, when foo has a RENAME op, we would emit NoticeRename of foo?)

Feb 14 '19 14:02 pnkfelix

I'm also hitting this in an automated test that creates a file in a watched directory, then renames it. It seems like the debouncer should be able to produce better events based on the raw event stream reported above.

May 07 '19 21:05 nathansobo

(Uh, first can I say this it is exciting and also kinda scary to have this project be used by something as huge as Atom. Good scary, I think! Also: I haven't looked at this at all, and off-topic here, but if @atom/notify internally has a Node binding for Notify, I'd be interested in having that separately as an official/semi-official Notify thing.)

Secondly, there's two core issues here. They have interplay in this case, are both a lot of work, but it is the plan to get them solved.

FSEvent

FSEvent is not super good at this whole file watching thing. This is discussed at length in various issues and threads on this project. The TL;DR is a matter of design and intent: FSEvent is not designed and intended for the kind of file watching Notify users expect. (As far as we can tell, it's designed for backup and indexing, i.e. it was made for macOS's Spotlight and Time Machine.)

The better API to use on macOS is kevent. It is however much more low-level, and this project has exactly zero experience with it because the BSD backend (which would have used kevent as a matter of course) never materialised.

So fixing this "properly" requires building, testing, and integrating a whole new backend, adding a completely new platform to the testing array (which complicates things as there is no native BSD support in Travis), yada yada. Added to that, and this may have improved since the last time I looked (~last year), there is no kevent -sys low-level binding or wrapper that is maintained and works on both BSDs and macOS, so it's likely Notify would have to take on that, too.

So we're kinda stuck with a sub-optimal platform for now. We can work around it, but it's likely never going to be awesome at it.

Debouncing

As part of the upcoming release, of which you can see a summary in the draft changelog, I'm currently deep inside the debounce code to rewrite it for the new Events.

As I reacquaint myself with how everything works exactly... yes, the debouncing could be a lot better. It's doing good for a lot of cases, but I can see many ways in which it is a bit naive or heavy-handed and could definitely get better.

One thing it does in particular is it's pretty aggressive in how it handles renames, and it will rewrite renames as completely different events (removes and creates) if they fall "wrong" on the debouncing interval boundaries. Another thing that plays into that is it considers a very limited view of past events per path (almost always only the last event) to decide what to do.

Recognition of the problem existing is the first step, right? ^^

The debounce code is fairly complex. I mean, it's not the most complex code I work on, but it's up there. That may also be something that can be improved.

Priorities

The priority for me right now is to get 5.0 out. That will not include any kind of big changes to how debouncing works, if I can at all avoid it. There's already a lot of things happening and I want to have something out (I don't do ETAs with Notify, but this month would be great, haha) before tacking on more work.

However, this will likely be the immediate next thing. There's a lot of wins to be had in the debounce code. There's a bunch of potential performance gains there, too, notably around locks and clones. So that's where the focus will be.

Hope that's useful information!

May 08 '19 03:05 passcod

Hi @passcod. Thanks for your response and your work on this library. Cross-platform file system notifications is a really thorny problem and you've packaged it up really nicely!

@atom/notify internally has a Node binding for Notify

We're actually just spawning a small subprocess to avoid the complexity of an official binding. I actually have a pretty nice wrapper for N-API buried in the Xray repo's commit history, but it's a lot simpler to just build a plain Rust binary and spawn it.

The better API to use on macOS is kevent

I guess the downside of that is consuming a file descriptor for every directory like with inotify, right?

It's really good to hear that you're working on improving the debouncing code. I suspect that when I'm not hammering the file system rapidly in automated tests, that the current logic will already work quite well, but obviously it would be nice to know that we're reporting events as accurately as possible to consumers of our API.

We're going to enable it by default on our nightly and beta release channels and see what happens. Thanks again for your work ⚡️.

May 08 '19 18:05 nathansobo

notify notify copied to clipboard

Renaming on OS X yields raw-events that confuse debouncer

The output classes

Remove bar (!!!), no mention of foo

Create bar, no mention of foo

Rename foo to bar (what I expected expected)

The code

FSEvent

Debouncing

Priorities

notify
notify copied to clipboard