IMAPdedup
Folders with a large number of messages cause an error
Hi,
IMAPdedup is a great tool! Thank you for your awesome work.
Folders with a large number of messages cause an error when run on my Synology 920+ (20GB RAM). Please see below.
I hope you can debug this. Thank you!
Tom.
=== CLI Output ===
There are 246410 messages in X_Folder.
No message(s) currently marked as deleted in X_Folder
Traceback (most recent call last):
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete
    typ, data = self._get_tagged_response(tag, expect_bye=logout)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response
    self._get_response()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1050, in _get_response
    resp = self._get_line()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1158, in _get_line
    line = self.readline()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 316, in readline
    raise self.error("got more than %d bytes" % _MAXLINE)
imaplib.error: got more than 1000000 bytes
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 477, in process
    msgnums = get_undeleted_msgnums(server, options.sent_before)
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 329, in get_undeleted_msgnums
    return get_matching_msgnums(server, "UNDELETED", sent_before)
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 312, in get_matching_msgnums
    deleted_info = check_response(server.search(None, query))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 725, in search
    typ, dat = self._simple_command(name, *criteria)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1205, in _simple_command
    return self._command_complete(name, self._command(name, *args))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1026, in _command_complete
    raise self.error('command: %s => %s' % (name, val))
imaplib.error: command: SEARCH => got more than 1000000 bytes
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete
    typ, data = self._get_tagged_response(tag, expect_bye=logout)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response
    self._get_response()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1079, in _get_response
    raise self.abort("unexpected response: %r" % resp)
imaplib.abort: unexpected response: b
One more error
There are 45967 messages in 2013.
No message(s) currently marked as deleted in 2013
45967 others in 2013
Traceback (most recent call last):
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete
    typ, data = self._get_tagged_response(tag, expect_bye=logout)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response
    self._get_response()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1050, in _get_response
    resp = self._get_line()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1160, in _get_line
    raise self.abort('socket error: EOF')
imaplib.abort: socket error: EOF
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 489, in process
    for mnum, hinfo in get_msg_headers(server, msgnums[i: i + chunkSize]):
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 352, in get_msg_headers
    ms = check_response(server.fetch(message_ids_str, "(RFC822.HEADER)"))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 539, in fetch
    typ, dat = self._simple_command(name, message_set, message_parts)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1205, in _simple_command
    return self._command_complete(name, self._command(name, *args))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1024, in _command_complete
    raise self.abort('command: %s => %s' % (name, val))
imaplib.abort: command: FETCH => socket error: EOF
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete
    typ, data = self._get_tagged_response(tag, expect_bye=logout)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response
    self._get_response()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1050, in _get_response
    resp = self._get_line()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1160, in _get_line
    raise self.abort('socket error: EOF')
imaplib.abort: socket error: EOF
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 597, in
Hi Tom -
Ah, yes, this (first error) cropped up 6 years ago (#19). The standard Python imaplib library that I use has a limit on the length of the response it expects back from the server. In those days, it was 10KB; I see they've now increased it to 1MB, but that's clearly not enough for some of us!
You could try increasing the limit by adding a line somewhere after import imaplib
which says, say,
imaplib._MAXLINE = 5000000
and see if that helps.
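As a rough illustration of why 1 MB falls short here, and where the override would go: a SEARCH reply lists every matching message number on a single line, so its length grows with the folder size. The sketch below is only an estimate under that assumption; the helper name estimated_search_line_bytes is made up for this example and is not part of imaplib or IMAPdedup. Note that _MAXLINE is a private attribute of imaplib, so it may change between Python versions.

```python
import imaplib


def estimated_search_line_bytes(n_messages: int) -> int:
    """Rough size in bytes of a '* SEARCH 1 2 ... n' reply line.

    Hypothetical helper for illustration; assumes the server returns
    every sequence number from 1 to n_messages on one line.
    """
    digits = sum(len(str(i)) for i in range(1, n_messages + 1))
    # '* SEARCH' prefix, one space before each number, then CRLF.
    return len(b"* SEARCH") + n_messages + digits + len(b"\r\n")


# Raise the limit after importing imaplib and before running any command.
# _MAXLINE is private to imaplib, so treat this as a workaround.
imaplib._MAXLINE = 5_000_000

needed = estimated_search_line_bytes(246410)
print(needed, needed <= imaplib._MAXLINE)  # prints: 1613775 True
```

By this estimate the 246410-message folder needs about 1.6 MB for the SEARCH reply, which is over the 1,000,000-byte default but under 5,000,000.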
The second one may simply be your email server not responding very fast, and there's probably less we can do about that.
One thing you could try (if this is only an occasional operation for you) is to split your big folder up into several smaller ones, and then run
./imapdedup.py ... folder1 folder2 folder3 folder4 ...
and then recombine them, if wanted, once the duplicates have been removed?
Unfortunately, the problem persists after adding imaplib._MAXLINE = 5000000.
The script processes some folders with >200k messages well and then produces an error processing a folder with 45k messages. The Synology is pretty fast, so I doubt it's the email server response time. The error is in the same folder as last time.
There are 45967 messages in 2013. No message(s) currently marked as deleted in 2013 45967 others in 2013 Traceback (most recent call last): ......
Splitting into multiple folders isn't an option, unfortunately.
Mmm. Strange.
Not much I can do there, I'm afraid, if imaplib can't cope with it; your server must be sending surprisingly long lines!
You can try increasing _MAXLINE further. Otherwise, this might be a question for the imaplib maintainers.
The only real way to get IMAPdedup to look at a subset of the messages is by using the --sentbefore option to restrict them by date. If you have lots of duplicates, you may be able to eliminate them in chunks this way...?
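For anyone curious what a date restriction looks like at the IMAP level, here is a minimal sketch using the standard SENTBEFORE search key. It is not IMAPdedup's actual code: the helper names imap_date and build_search_criteria are hypothetical, and the server connection is left as a comment because it needs a real mailbox.

```python
from datetime import date
from typing import List, Optional

# Month names are spelled out so the result does not depend on the locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d: date) -> str:
    """Format a date the way IMAP search keys expect, e.g. 01-Jan-2014."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"


def build_search_criteria(base: str,
                          sent_before: Optional[date] = None) -> List[str]:
    """Return search criteria, optionally limited to older messages."""
    criteria = [base]
    if sent_before is not None:
        criteria += ["SENTBEFORE", imap_date(sent_before)]
    return criteria


criteria = build_search_criteria("UNDELETED", date(2014, 1, 1))
print(criteria)  # ['UNDELETED', 'SENTBEFORE', '01-Jan-2014']

# With an open, logged-in imaplib connection called `server` (assumed):
# typ, data = server.search(None, *criteria)
```

Because fewer messages match, the SEARCH reply line is shorter too, which may keep it under the imaplib limit.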
Update: Setting imaplib._MAXLINE = 5000000 seems to have solved the problem for all folders but one. I suppose the fault is in this folder, then, and not in the script. Thank you very much! I suggest adding imaplib._MAXLINE = 5000000 as the default.
I had the same issue, but using the below worked.
imaplib._MAXLINE = 10000000
5,000,000 wasn't enough but 10,000,000 was.
Thanks!