introduce hook to track down any change to local Maildirs

Open avar opened this issue 9 years ago • 13 comments

Some programs like "mu-index" will consume the Maildir that offlineimap writes out. Currently many of them have to scan the entire Maildir on every invocation to find what files have changed.

It would be much more efficient if offlineimap could write a plain text file like:

NEW <path>
MODIFIED <path>
CHANGED <path>

And these other tools could simply atomically move that file out of the way (offlineimap would have to re-open the file for every write) and consume the file list.
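A rough sketch of that handoff in Python, for illustration only. The helper names `record_change` and `consume_changes` and the `.claimed` suffix are hypothetical; nothing like this exists in offlineimap today.

```python
import os

def record_change(journal_path, kind, mail_path):
    """Writer side: append one 'KIND <path>' line, reopening the file each time."""
    # Reopening on every write means that if a consumer has moved the journal
    # away, the next write simply creates a fresh one.
    with open(journal_path, "a", encoding="utf-8") as journal:
        journal.write(f"{kind} {mail_path}\n")

def consume_changes(journal_path):
    """Consumer side: move the journal out of the way, then read it."""
    claimed = journal_path + ".claimed"  # hypothetical naming convention
    try:
        # A rename within one filesystem is atomic on POSIX systems.
        os.replace(journal_path, claimed)
    except FileNotFoundError:
        return []  # nothing was recorded since the last run
    changes = []
    with open(claimed, encoding="utf-8") as journal:
        for line in journal:
            # Split on the first space only, so paths containing spaces survive.
            kind, _, path = line.rstrip("\n").partition(" ")
            changes.append((kind, path))
    os.remove(claimed)
    return changes
```

One caveat with this sketch: a writer that opened the journal just before the rename can still append to the moved file, so a real implementation would have to decide how to handle that window.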

If you're not interested in hacking this up, I'd be interested in doing it myself, despite my limited experience of how offlineimap works. Pointers about where in the code to put this would be most welcome.

Aug 02 '16 20:08 avar

You might like to check the Maildir driver offlineimap/folder/Maildir.py.

Aug 02 '16 20:08 nicolas33

> Some programs like "mu-index" will consume the Maildir that offlineimap writes out. Currently many of them have to scan the entire Maildir on every invocation to find what files have changed.

They should use inotify or a similar feature where available.

I would rather think about a more generic solution, such as hooks that invoke Python code:

on_new_email = /path/to/python/file.py

In that case, please create a new bug report and we can discuss this further.
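To make the idea concrete, here is a minimal sketch of how a hook file named in such an option could be loaded. It assumes, purely for illustration, that the file defines a function with the same name as the option (`on_new_email`); `load_hook` is a hypothetical helper, not an existing offlineimap function.

```python
import importlib.util

def load_hook(path, name="on_new_email"):
    """Load the function `name` from the Python file at `path`.

    Returns None if the file does not define it.
    """
    spec = importlib.util.spec_from_file_location("user_hook", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load hook file: {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the user's file once
    return getattr(module, name, None)
```

The caller would then invoke the returned function with whatever arguments the hook contract ends up defining, for example the path of the affected message file.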

Aug 08 '16 14:08 dolohow

There is already the newmail_hook feature. I agree this is currently limited and it might be interesting to deprecate this in favor of a new configuration option (newmail_hook_eval?) calling Python code to get more flexibility. However, I'm not sure the filenames are known at that time.

"New mail" is a bit ambiguous because of the two-way sync.

Aug 08 '16 22:08 nicolas33

I agree with @dolohow that this specific feature is not generic enough.

Aug 08 '16 22:08 nicolas33

Yeah, something hookable would be great. For what it's worth, it's not just "new" mail; it's anything that would currently result in a read-write file operation in the Maildir, i.e.:

  • Addition
  • Modification
  • Deletion
  • Rename?

I don't know what the complete list would be.

But yeah, having a hook that would just get fed the filename for "I'm about to change this" would be great. Or actually, a better design would probably be one where the hook gets passed a function that does the actual work, and the hook is expected to call it. Then the hook can decide whether it acts on to-be-done operations or only on ones known to be completed.
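A small sketch of that second design, with hypothetical names throughout (`run_with_hook`, `log_completed`, and the `(operation, path, action)` signature are made up for illustration and are not part of offlineimap):

```python
def run_with_hook(hook, operation, path, action):
    """Perform `action`, routing it through `hook` when one is configured.

    The hook receives the callable that does the real work and must call it,
    so it can choose to react before the change, after it, or both.
    """
    if hook is None:
        return action()
    return hook(operation, path, action)

# Example hook: only record operations that are known to have completed.
completed = []

def log_completed(operation, path, action):
    result = action()                     # do the real Maildir change
    completed.append((operation, path))   # reached only if action() did not raise
    return result
```

A hook interested in to-be-done operations would instead record the event before calling `action()`.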

Aug 08 '16 22:08 avar

> Yeah something hookable would be great. For what it's worth it's not just "new" mail, it's anything that currently would result in rw file operation in the Maildir, i.e.:

Once again, if an application wants to observe that kind of operation, it should use filesystem-specific features like the aforementioned inotify. This is why it was created in the first place.

Aug 09 '16 07:08 dolohow

@dolohow For what it's worth, I thought about the inotify case; I've written inotify daemons myself.

It's really a huge PITA to use something like that to solve such a simple case. I'll look into Maildir.py as @nicolas33 suggested to see if I can do this with some FS abstraction layer, which would have the nice side effect of making the Maildir core more pluggable for other things.

Why is inotify a PITA? Because:

  • Right now neither offlineimap nor the thing I have in mind to consume this data runs as a daemon.
  • So now I'd need to run a persistent program that's not offlineimap or my indexer to manage state between the two.
  • Inotify daemons need to run all the time, and they need to be highly available: if you don't consume kernel events quickly enough they're lost, so in addition to consuming new events you need to run some directory traversal in the background to fix your internal state. See the watchman source code for an example of this.
  • You're effectively getting a firehose of IO-related system calls which you then have to make sense of, as opposed to app-specific callbacks saying "this one's a new mail file"

Aug 09 '16 08:08 avar

> Once again, if application wants to observe that kind of operations, it should use fs specific features like aforementioned inotify. This is why it was created in the first place.

@dolohow While it's a good "workaround" or a good approach in this case, I'm fine with a new feature in offlineimap because inotify is not available on all the platforms offlineimap can run on.

Aug 09 '16 12:08 nicolas33

@avar If you're still interested, I'll make a patch to get you started.

Nov 06 '16 02:11 nicolas33

@nicolas33 That would be great. No promises on when I'll get around to finishing this up, but just having a general hint of where to get started would be nice.

Nov 06 '16 17:11 avar

Here's a start. I've made a quick mapping between the real changes in the Maildir and a Notify object.

Treat all of this code with caution and double-check it, since I haven't even reread it.

Note that this won't track changes to the folder structure, because that is handled elsewhere in the codebase. Depending on the needs, this could either be reverse engineered from the data we get from the MaildirFolder (low impact on the codebase) or fully integrated (high impact).

Have fun!

Nov 06 '16 22:11 nicolas33

Branch rebased on top of next: https://github.com/nicolas33/offlineimap/tree/b367 Does this patch help?

Nov 23 '16 16:11 nicolas33

Thanks for the patch. I probably won't pick this up any time soon, but it's on my TODO list of things to hack on. It will be very useful if/when I get around to it, thanks!

Nov 23 '16 19:11 avar