django-mailbox
IMAP: Crash can cause Message to be duplicated and processed multiple times.
The IMAP transport's mailbox fetch works like this:
- Get all message IDs (even ones marked \Deleted).
- For each message ID:
  a. fetch the email
  b. call the receive signal
  c. mark it \Deleted
- Expunge all messages marked \Deleted.
We are experiencing the same message being downloaded multiple times, in situations where our polling mechanism crashes in the middle of looping (in our case, due to many messages taking too long and Celery sigkilling the process).
This duplication seems avoidable. I am no IMAP expert, but from reviewing the IMAP RFC, it seems we could either move the expunge into the loop so it happens right after each \Deleted mark, or else change the search in _get_all_message_ids
from:
response, message_ids = self.server.uid('search', None, 'ALL')
to
response, message_ids = self.server.uid('search', None, 'UNDELETED')
Or both... thoughts?
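For illustration, here is a rough sketch of what "both" might look like. It is not the library's actual code: `fetch_undeleted` and the `handle` callback (standing in for the receive signal) are hypothetical names, and `server` is assumed to behave like a logged-in `imaplib.IMAP4` connection with a mailbox already selected.

```python
import email


def fetch_undeleted(server, handle):
    """Process every message not yet flagged as deleted, one at a time.

    `server` is assumed to be an imaplib.IMAP4-like object; `handle` is a
    hypothetical callback standing in for the receive signal.
    Returns the number of messages processed.
    """
    # Skip anything an earlier, interrupted run already marked \Deleted.
    typ, data = server.uid('search', None, 'UNDELETED')
    uids = data[0].split() if data and data[0] else []

    processed = 0
    for uid in uids:
        typ, msg_data = server.uid('fetch', uid, '(RFC822)')
        message = email.message_from_bytes(msg_data[0][1])

        handle(message)

        # Flag and expunge immediately, so a later crash cannot bring
        # this message back on the next poll.
        server.uid('store', uid, '+FLAGS', '(\\Deleted)')
        server.expunge()
        processed += 1
    return processed
```

Because the loop addresses messages by UID rather than by sequence number, expunging inside it should not shift the remaining identifiers. Note that a crash between `handle(message)` and the `store` call would still replay that one message, so this narrows the window to a single message rather than closing it entirely.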
Any update on this? I have also experienced the same issue.
Every <email.message.Message object> has a ['message-id'] header, and it is stored in the Message model. I think a setting to avoid duplicates could be added. For example, add this validation to django_mailbox/models/process_incoming_message
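A minimal sketch of that kind of check, kept independent of Django so it can be run on its own. `should_process` is a hypothetical helper, and the `seen_message_ids` set is an assumption standing in for a lookup against the Message-ID values already stored on Message rows.

```python
import email


def should_process(message, seen_message_ids):
    """Return False if this message's Message-ID was already recorded.

    `seen_message_ids` is a plain set here; in the real app it would be
    replaced by a query against the stored messages (an assumption).
    """
    message_id = (message.get('Message-ID') or '').strip()
    if not message_id:
        # Nothing to compare against, so let the message through.
        return True
    if message_id in seen_message_ids:
        return False
    seen_message_ids.add(message_id)
    return True


if __name__ == '__main__':
    seen = set()
    msg = email.message_from_string('Message-ID: <abc@example.com>\n\nhello')
    print(should_process(msg, seen))  # first sighting
    print(should_process(msg, seen))  # same Message-ID again
```

One caveat with this approach: Message-ID is set by the sender, so two genuinely different emails can carry the same value, which is probably why it would be better as an opt-in setting than as default behaviour.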