
Provide a (better) bulk import of mails

Open eikek opened this issue 4 years ago • 4 comments

For context: #551

The scan-mailbox-task is currently designed to periodically import "some" mails. In contrast, there is a desire to import "all" mails of a mailbox for bootstrapping. This is not periodic but a one-time operation. Either the scan-mailbox-task can be improved/extended, or a different task could be provided.

eikek avatar Jan 09 '21 13:01 eikek

The problem with the current solution is that seen mails must be moved out of the way (or the filter adjusted so that it does not select them again). Some workaround ideas that can be used today:

  • Move all mails in your mail client into some folder and let docspell import this folder while moving mails back to Inbox (or whatever original folder) after processing
  • Change scan-mailbox.mail-chunk-size and scan-mailbox.max-mails in the config file to some high(er) value. Then start the normal scan-mailbox-task.
  • Use mail sync tools, such as offlineimap or mbsync, to download all mail to disk, and then upload all files via the consumedir script
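For illustration only, the third workaround (download everything to disk first) could be sketched with Python's standard imaplib instead of offlineimap or mbsync. This is not part of docspell; the host, credentials, folder name, and output directory are placeholders, and the upload step (consumedir or similar) is left out.

```python
import imaplib
from pathlib import Path


def export_folder(conn, folder: str, out_dir: Path) -> int:
    """Write every message in `folder` to out_dir as NNNNNN.eml; return the count."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # readonly=True opens the folder without changing flags such as \Seen
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("search failed")
    count = 0
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK" or not parts or not isinstance(parts[0], tuple):
            continue  # skip messages the server did not return
        raw = parts[0][1]  # full message source as bytes
        (out_dir / f"{int(num):06d}.eml").write_bytes(raw)
        count += 1
    return count


# Usage (placeholder values, replace with your own):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user@example.com", "app-password")
#   print(export_folder(conn, "INBOX", Path("mail-export")))
#   conn.logout()
```

The function takes an already logged-in connection, so it can be called once per folder. Folder names containing spaces may need to be quoted before being passed to `select`.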

eikek avatar Jan 09 '21 13:01 eikek

  • Move all mails in your mail client into some folder and let docspell import this folder while moving mails back to Inbox (or whatever original folder) after processing

+1 for this

  • Change scan-mailbox.mail-chunk-size and scan-mailbox.max-mails in the config file to some high(er) value. Then start the normal scan-mailbox-task.

Does this mean change it to = the total number of emails in a folder?

voarsh2 avatar Mar 17 '21 05:03 voarsh2

Does this mean change it to = the total number of emails in a folder?

Yes, this should work, but I've never tried that with a lot of mails myself.
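To find a candidate value for scan-mailbox.max-mails, the number of mails in a folder can be read from the IMAP server. A minimal sketch using Python's standard imaplib, with placeholder host and credentials (not something docspell provides):

```python
import imaplib


def count_mails(conn, folder: str = "INBOX") -> int:
    """Return the number of messages the server reports for `folder`."""
    # SELECT/EXAMINE answers with the message count; readonly avoids flag changes
    status, data = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot select folder {folder!r}")
    return int(data[0])


# Usage (placeholder values, replace with your own):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user@example.com", "app-password")
#   print(count_mails(conn, "INBOX"))
#   conn.logout()
```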

eikek avatar Mar 17 '21 19:03 eikek

I don't know if it's just me, but bulk-importing mail sounds easier to me if done by exporting EML files to a temporary filesystem folder, running sanity checks/cleanups, and then using dsc to import them, explicitly adding the metadata you know exists in the mails rather than hoping the system might find it, if you know what I mean.
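As a rough sketch of the "sanity checks" step, something like the following could read the headers of each exported EML file and separate files with usable metadata from those needing manual cleanup. It uses only Python's standard email package; the choice of required fields (subject, sender, date) is an assumption, and how the metadata is then passed to dsc is deliberately left out.

```python
from email import policy
from email.parser import BytesParser
from email.utils import parsedate_to_datetime
from pathlib import Path


def read_metadata(eml_path: Path) -> dict:
    """Parse only the headers of one EML file and return selected fields."""
    with eml_path.open("rb") as fh:
        msg = BytesParser(policy=policy.default).parse(fh, headersonly=True)
    try:
        raw_date = msg["Date"]
        date = parsedate_to_datetime(str(raw_date)).date().isoformat() if raw_date else None
    except (TypeError, ValueError):
        date = None  # missing or malformed Date header
    return {
        "file": eml_path.name,
        "subject": str(msg["Subject"] or "").strip(),
        "sender": str(msg["From"] or "").strip(),
        "date": date,
    }


def check_folder(folder: Path) -> tuple[list[dict], list[str]]:
    """Split EML files into ones with complete metadata and ones to review."""
    good, problems = [], []
    for path in sorted(folder.glob("*.eml")):
        meta = read_metadata(path)
        if meta["subject"] and meta["sender"] and meta["date"]:
            good.append(meta)
        else:
            problems.append(path.name)
    return good, problems
```

The first list holds one dictionary per importable mail, which can be mapped to whatever metadata the upload step accepts; the second list names the files to inspect by hand.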

madduck avatar Sep 08 '23 22:09 madduck