watchfiles

Tracking file rename

Open davidbrochart opened this issue 2 years ago • 8 comments

In Jupyter we have a new service that allows tracking files when they are renamed. It currently implements most of the logic, but I was wondering if this is something that watchfiles could support. I know Notify doesn't support that feature, but maybe this could be a nice addition to watchfiles?

davidbrochart avatar Nov 03 '22 09:11 davidbrochart

Notify does support renaming with some backends, see here. I think some backends (e.g. polling) don't support it, but most do.

If you run watchfiles in debug mode, you'll see the raw events and be able to see what we get to process (e.g. run watchfiles 'echo reloaded' . --verbose).

I guess there are two options for watchfiles:

  • yield rename events when we get them, but fall back to just "deleted, added" events otherwise
  • or (less good I think), process the simplified stream of events that we produce now and try to identify renaming

I guess the third option would be to combine the two, but I think that's too complicated. If we really wanted rename events with backends that don't natively support them, I'd rather that was implemented in notify; there was even some discussion of this on https://github.com/notify-rs/notify/issues/20 and maybe in other issues.

This might also require better event ordering, see https://github.com/samuelcolvin/watchfiles/issues/148#issuecomment-1289586991. If the theory posited on #148 is right, that the ordering problem is caused by the use of a hashset, it should be pretty easy to fix.
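As a rough Python analogy for the ordering concern (the actual event collection lives in watchfiles' Rust code, so this is only an illustration, and `dedupe_keep_order` is a hypothetical helper), collecting events in a hash set discards arrival order, while an insertion-ordered container can deduplicate and still keep it:

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def dedupe_keep_order(events: Iterable[T]) -> List[T]:
    """Drop duplicate events while keeping first-seen order.

    A plain set would also drop duplicates, but its iteration order
    is not tied to the order in which events arrived.
    """
    # dict preserves insertion order (guaranteed since Python 3.7)
    return list(dict.fromkeys(events))


if __name__ == "__main__":
    raw = [("deleted", "a.txt"), ("added", "b.txt"), ("deleted", "a.txt")]
    print(dedupe_keep_order(raw))
```

With ordered events, a "deleted" followed by an "added" can at least be read in the sequence in which it happened.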

samuelcolvin avatar Nov 03 '22 10:11 samuelcolvin

  • yield rename events when we get them, but fall back to just "deleted, added" events otherwise

Yes, I think we should take advantage of Notify's information about renaming when it is available.

  • or (less good I think), process the simplified stream of events that we produce now and try to identify renaming

I guess the third option would be to combine the two, but I think that's too complicated. If we really wanted rename events with backends that don't natively support them, I'd rather that was implemented in notify; there was even some discussion of this on notify-rs/notify#20 and maybe in other issues.

I agree that it would be better if it were implemented in Notify, but maybe watchfiles could go a bit further and, for instance, support a file tracking mode where you explicitly pass a file that you want to track. Then it could offer several strategies to try and track this file, the most brute-force being identifying files by content. By reducing the scope of events to just this file, it might not be too expensive. I think this kind of thing would likely not be accepted in Notify, and would be easier to implement in watchfiles?
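The brute-force "identify by content" strategy could look roughly like the sketch below. This is not an existing watchfiles feature; `content_digest` and `find_by_content` are hypothetical helper names, and the sketch assumes the tracked file's content does not change during the rename:

```python
import hashlib
from pathlib import Path
from typing import Optional


def content_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_by_content(directory: Path, digest: str) -> Optional[Path]:
    """Brute-force search for a file under `directory` whose content matches `digest`.

    Returns the first match in sorted path order, or None if nothing matches.
    """
    for candidate in sorted(directory.rglob("*")):
        if candidate.is_file() and content_digest(candidate) == digest:
            return candidate
    return None
```

A caller would record the digest of the tracked file up front and, when that path is reported as deleted, call `find_by_content` on the watched directory to look for its new location. Files with identical content are indistinguishable under this scheme, and hashing every file is expensive for large trees.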

davidbrochart avatar Nov 03 '22 11:11 davidbrochart

Maybe, but it sounds complicated in watchfiles too :smile:.

Let's start by fixing #148 and adding a Rename option to the change we yield and see how well that works.

Another question: this would be a breaking change.

How do you feel about releasing what we have on main now as V1 (see #186), then any breaking changes to support renaming would be released as V2? I'm well aware that watchfiles should be at v1 by now.

samuelcolvin avatar Nov 03 '22 12:11 samuelcolvin

Let's start by fixing #148 and adding a Rename option to the change we yield and see how well that works.

:+1:

How do you feel about releasing what we have on main now as V1 (see #186), then any breaking changes to support renaming would be released as V2? I'm well aware that watchfiles should be at v1 by now.

I agree with that, release main as v1 and support renaming in v2.

davidbrochart avatar Nov 03 '22 12:11 davidbrochart

New plan at https://github.com/samuelcolvin/watchfiles/pull/202#issuecomment-1305468800.

If anyone wants to take this on, feel free.

samuelcolvin avatar Nov 07 '22 12:11 samuelcolvin

In https://github.com/jupyter-server/jupyverse/pull/244 I implemented the tracking of file renames using Change.added, Change.deleted and modification times. I'm quite happy with it. I guess implementing this feature in Rust is not really needed anymore for me.
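For readers wondering what pairing deleted and added events by modification time might look like, here is a minimal sketch. It is not the jupyverse implementation: `detect_renames` and its mtime dictionaries are hypothetical, and the `Change` enum is a local stand-in mirroring the member names of `watchfiles.Change` so the example runs without watchfiles installed.

```python
from enum import IntEnum
from typing import Dict, Iterable, List, Tuple


class Change(IntEnum):
    # Local stand-in for watchfiles.Change (assumption: same member names)
    added = 1
    modified = 2
    deleted = 3


def detect_renames(
    changes: Iterable[Tuple[Change, str]],
    known_mtimes: Dict[str, float],
    current_mtimes: Dict[str, float],
) -> List[Tuple[str, str]]:
    """Pair deleted and added paths from one batch whose mtimes match.

    known_mtimes: mtimes recorded earlier for files being tracked.
    current_mtimes: mtimes read from disk for the newly added paths.
    Returns a list of (old_path, new_path) tuples.
    """
    batch = list(changes)
    # Index deleted paths by their last known modification time
    deleted_by_mtime = {
        known_mtimes[path]: path
        for change, path in batch
        if change == Change.deleted and path in known_mtimes
    }
    renames = []
    for change, path in batch:
        if change == Change.added and path in current_mtimes:
            old = deleted_by_mtime.pop(current_mtimes[path], None)
            if old is not None:
                renames.append((old, path))
    return renames
```

Because a rename normally leaves the modification time untouched, an added path with the same mtime as a just-deleted path is a plausible rename target. Two tracked files sharing an mtime would be ambiguous, so this is a heuristic rather than a guarantee.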

davidbrochart avatar Nov 22 '22 14:11 davidbrochart

OK, I still think it would be a good idea here, but good to know it's not urgent.

samuelcolvin avatar Nov 22 '22 15:11 samuelcolvin

As a consumer, simply getting a Change.renamed event would not be useful to me unless I knew what the file was renamed from as well as to. The current return from watch provides only the change event and a [final state] file path. Giving callers of watch access to (a perhaps normalized version of) the underlying raw event data would solve the problem for me. I'm running on Windows, and when I add a file I get a Change.added event where the raw event Kind is Create(Any), followed by a Change.modified event where the raw event Kind is Modify(Any). When I rename a file I get a Change.deleted event where the raw event Kind is Modify(Name(From)), followed by a Change.added event where the raw event Kind is Modify(Name(To)).

If I had access to the raw event Kind, I would be able to detect rename events and even know the 'from' and 'to' paths without changing the existing change and path returned by watch. (Unless there is an ordering problem with the events returned by watch. In that case, correlating the delete/add events would be difficult.)
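If the raw kinds were exposed, the correlation described above could be sketched as follows. watch does not currently return this data, so the `(kind, path)` tuple shape and the `pair_renames` helper are hypothetical; the kind strings are the ones quoted in this comment.

```python
from typing import Iterable, Iterator, Optional, Tuple

RENAME_FROM = "Modify(Name(From))"
RENAME_TO = "Modify(Name(To))"


def pair_renames(raw_events: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield (old_path, new_path) for each From event immediately followed by a To event.

    raw_events is an ordered iterable of (raw_kind, path) tuples.
    """
    pending_from: Optional[str] = None
    for kind, path in raw_events:
        if kind == RENAME_FROM:
            pending_from = path
        elif kind == RENAME_TO and pending_from is not None:
            yield pending_from, path
            pending_from = None
        else:
            # Any other event in between breaks the pairing
            pending_from = None
```

This relies on the From and To events arriving in order and adjacent to each other, which is exactly the ordering caveat raised above; a more tolerant version would need some other key to match the two halves.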

I am looking for new files added to a directory and, in its current state, I cannot use watchfiles because a rename looks like an add. It's a shame too. I had been using the watchdog package but it proved unreliable in monitoring UNC paths when a remote computer is restarted. watchfiles successfully resumes reporting events when the remote computer restarts.

nkmhor avatar Sep 29 '23 22:09 nkmhor