watchfiles
Tracking file rename
In Jupyter we have a new service that allows tracking files when they are renamed. It currently implements most of the logic itself, but I was wondering if this is something that watchfiles could support. I know Notify doesn't support that feature, but maybe this could be a nice addition to watchfiles?
Notify does support renaming with some backends (see here). I think some backends (e.g. polling) don't support it, but most do.
If you run watchfiles in debug mode, you'll see the raw events and be able to see what we get to process (e.g. run watchfiles 'echo reloaded' . --verbose).
I guess there are two options for watchfiles:
- yield rename events when we get them, but fall back to just "deleted, added" events otherwise
- or (less good I think), process the simplified stream of events that we produce now and try to identify renaming
I guess the third option would be to combine the two, but I think that's too complicated. If we really wanted rename events with backends that don't natively support them, I'd rather that was implemented in notify; there was even some discussion of this on https://github.com/notify-rs/notify/issues/20 and maybe in other issues.
This might also require better event ordering, see https://github.com/samuelcolvin/watchfiles/issues/148#issuecomment-1289586991. If the theory posited on #148 is right and the ordering problem is caused by the use of a hash set, it should be pretty easy to fix.
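To make the second option above concrete, here is a minimal sketch of identifying a rename from the simplified stream. It is not part of watchfiles: `guess_rename` is a hypothetical helper, and the `Change` enum is a local stand-in that mirrors the names of `watchfiles.Change` so the snippet runs on its own. It only claims a rename when a batch holds exactly one deletion and one addition, which is a heuristic and can be wrong.

```python
from enum import IntEnum
from typing import Optional, Set, Tuple


class Change(IntEnum):
    # Stand-in mirroring watchfiles.Change so the sketch is self-contained.
    added = 1
    modified = 2
    deleted = 3


def guess_rename(batch: Set[Tuple[Change, str]]) -> Optional[Tuple[str, str]]:
    """Return (old_path, new_path) if the batch looks like a single rename."""
    deleted = [path for change, path in batch if change == Change.deleted]
    added = [path for change, path in batch if change == Change.added]
    # Only claim a rename in the unambiguous case: one delete and one add.
    if len(deleted) == 1 and len(added) == 1 and deleted[0] != added[0]:
        return deleted[0], added[0]
    return None
```

Because each batch is a set, this cannot tell which event came first, and it gives up as soon as several files are deleted or added together. That is part of why the first option looks more attractive.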
> - yield rename events when we get them, but fall back to just "deleted, added" events otherwise
Yes, I think we should take advantage of Notify's information about renaming when it is available.
> - or (less good I think), process the simplified stream of events that we produce now and try to identify renaming
> I guess the third option would be to combine the two, but I think that's too complicated - if we really wanted rename events with backends that don't natively support it, I'd rather that was implemented in notify, there was even some discussion of this on notify-rs/notify#20 and maybe in other issues.
I agree that it would be better if it were implemented in Notify, but maybe watchfiles could go a bit further and, for instance, support a file tracking mode where you explicitly pass a file that you want to track. Then it could offer several strategies to try and track this file, the most brute-force being identifying files by content. By reducing the scope of events to just this file, it might not be too expensive. I think this kind of thing would likely not be accepted in Notify, and would be easier to implement in watchfiles?
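As a rough illustration of the brute-force "identify by content" strategy, the sketch below hashes the tracked file once and then looks for the same digest among newly added paths. The function names `content_digest` and `find_by_content` are hypothetical and nothing like this exists in watchfiles today; it also assumes the file's content does not change between the rename and the lookup.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Optional


def content_digest(path: Path) -> str:
    """Hash the whole file; fine for small files, costly for large ones."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_by_content(digest: str, candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate whose content matches the tracked digest."""
    for candidate in candidates:
        try:
            if candidate.is_file() and content_digest(candidate) == digest:
                return candidate
        except OSError:
            # The candidate may have vanished or be unreadable; skip it.
            continue
    return None
```

The candidates would typically be the paths reported as added after the tracked path was reported as deleted, which keeps the number of files to hash small.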
Maybe, but it sounds complicated in watchfiles too :smile:.
Let's start by fixing #148 and adding a Rename option to the change we yield and see how well that works.
Another question: this would be a breaking change.
How do you feel about releasing what we have on main now as V1 (see #186), then any breaking changes to support renaming would be released as V2? I'm well aware that watchfiles should be at v1 by now.
> Let's start by fixing #148 and adding a Rename option to the change we yield and see how well that works.
:+1:
> How do you feel about releasing what we have on main now as V1 (see #186), then any breaking changes to support renaming would be released as V2? I'm well aware that watchfiles should be at v1 by now.
I agree with that: release main as v1 and support renaming in v2.
New plan at https://github.com/samuelcolvin/watchfiles/pull/202#issuecomment-1305468800.
If anyone wants to take this on, feel free.
In https://github.com/jupyter-server/jupyverse/pull/244 I implemented the tracking of file renames using Change.added, Change.deleted and modification times.
I'm quite happy with it. I guess implementing this feature in Rust is not really needed anymore for me.
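For readers curious what matching on modification times could look like, here is a minimal sketch of the general idea. It is not the jupyverse implementation: the `MtimeRenameTracker` class and its method names are hypothetical, and it assumes a rename leaves the file's modification time unchanged and that the deletion is reported before the addition.

```python
import os
from typing import Dict, Optional


class MtimeRenameTracker:
    def __init__(self) -> None:
        self.mtimes: Dict[str, int] = {}   # known path -> last seen mtime (ns)
        self.pending: Dict[int, str] = {}  # mtime of a deleted file -> old path

    def remember(self, path: str) -> None:
        """Record the mtime of a file we want to be able to follow."""
        self.mtimes[path] = os.stat(path).st_mtime_ns

    def on_deleted(self, path: str) -> None:
        """Call for each Change.deleted path."""
        mtime = self.mtimes.pop(path, None)
        if mtime is not None:
            self.pending[mtime] = path

    def on_added(self, path: str) -> Optional[str]:
        """Call for each Change.added path; returns the old path on a match."""
        mtime = os.stat(path).st_mtime_ns
        old_path = self.pending.pop(mtime, None)
        self.mtimes[path] = mtime
        return old_path
```

Two unrelated files with identical modification times would be confused by this sketch, so a real implementation would likely want a second check such as file size.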
OK, I still think it would be a good idea here, but good to know it's not urgent.
As a consumer, simply getting a Change.renamed event would not be useful to me unless I knew what the file was renamed from as well as to. The current return from watch provides only the change event and a [final state] file path. Giving callers of watch access to (a perhaps normalized version of) the underlying raw event data would solve the problem for me. I'm running on Windows, and when I add a file I get a Change.added event where the raw event Kind is Create(Any), followed by a Change.modified event where the raw event Kind is Modify(Any). When I rename a file I get a Change.deleted event where the raw event Kind is Modify(Name(From)), followed by a Change.added event where the raw event Kind is Modify(Name(To)).
If I had access to the raw event Kind, I would be able to detect rename events and even know the 'from' and 'to' paths without changing the existing change and path returned by watch. (Unless there is an ordering problem with the events returned by watch. In that case, correlating the delete/add events would be difficult.)
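To illustrate the correlation described above, here is a minimal sketch that pairs a Modify(Name(From)) event with the Modify(Name(To)) event that immediately follows it. The `(kind, path)` tuple shape and the `pair_renames` helper are hypothetical: watch does not expose raw kinds today, and the kind strings are simply the ones seen in the debug output quoted above. It assumes the events arrive in order.

```python
from typing import Iterable, Iterator, Optional, Tuple

# (kind, path); a hypothetical shape, not something watch returns today.
RawEvent = Tuple[str, str]


def pair_renames(events: Iterable[RawEvent]) -> Iterator[Tuple[str, str]]:
    """Yield (from_path, to_path) for each adjacent From/To pair."""
    pending_from: Optional[str] = None
    for kind, path in events:
        if kind == "Modify(Name(From))":
            pending_from = path
        elif kind == "Modify(Name(To))" and pending_from is not None:
            yield pending_from, path
            pending_from = None
        else:
            # Any other event in between breaks the pairing.
            pending_from = None
```

Requiring the two events to be adjacent is deliberately strict; if ordering cannot be relied on, the pairing would have to fall back to heuristics.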
I am looking for new files added to a directory and, in its current state, I cannot use watchfiles because a rename looks like an add. It's a shame too. I had been using the watchdog package but it proved unreliable in monitoring UNC paths when a remote computer is restarted. watchfiles successfully resumes reporting events when the remote computer restarts.