IMAPdedup
Folders with a large number of messages cause an error
Hi,
IMAPdedup is a great tool! Thank you for your awesome work.
Folders with a large number of messages cause an error when run on my Synology 920+ (20GB RAM). Please see below.
I hope you can debug this. Thank you!
Tom.
=== CLI Output ===
There are 246410 messages in X_Folder.
No message(s) currently marked as deleted in X_Folder
Traceback (most recent call last):
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete
    typ, data = self._get_tagged_response(tag, expect_bye=logout)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response
    self._get_response()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1050, in _get_response
    resp = self._get_line()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1158, in _get_line
    line = self.readline()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 316, in readline
    raise self.error("got more than %d bytes" % _MAXLINE)
imaplib.error: got more than 1000000 bytes
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 477, in process
    msgnums = get_undeleted_msgnums(server, options.sent_before)
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 329, in get_undeleted_msgnums
    return get_matching_msgnums(server, "UNDELETED", sent_before)
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 312, in get_matching_msgnums
    deleted_info = check_response(server.search(None, query))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 725, in search
    typ, dat = self._simple_command(name, *criteria)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1205, in _simple_command
    return self._command_complete(name, self._command(name, *args))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1026, in _command_complete
    raise self.error('command: %s => %s' % (name, val))
imaplib.error: command: SEARCH => got more than 1000000 bytes
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete typ, data = self._get_tagged_response(tag, expect_bye=logout) File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response self._get_response() File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1079, in _get_response raise self.abort("unexpected response: %r" % resp) imaplib.abort: unexpected response: b'58729 158730 158731 158732 158733 158734 158735 158736 158737 158738 158739 158740 158741 158742 158743 158744 158745 158746 158747 158748 158749 158750 158751 158752 158753 158754 158755 158756 158757 158758 158759 158760 158761 158762 158763 158764 158765 158766 158767 158768 158769 158770 158771 158772 158773 158774 158775 158776 158777 158778 158779 158780 158781 158782 158783 158784 158785 158786 158787 158788 158789 158790 158791 158792 158793 158794 158795 158796 158797 158798 158799 158800 158801 158802 158803 158804 158805 158806 158807 158808 158809 158810 158811 158812 158813 158814 158815 158816 158817 158818 158819 158820 158821 158822 158823 158824 158825 158826 158827 158828 158829 158830 158831 158832 158833 158834 158835 158836 158837 158838 158839 158840 158841 158842 158843 158844 158845 158846 158847 158848 158849 158850 158851 158852 158853 158854 158855 158856 158857 158858 158859 158860 158861 158862 158863 158864 158865 158866 158867 158868 158869 158870 158871 158872 158873 158874 158875 158876 158877 158878 158879 158880 158881 158882 158883 158884 158885 158886 158887 158888 158889 158890 158891 158892 158893 158894 158895 158896 158897 158898 158899 158900 158901 158902 158903 158904 158905 158906 158907 158908 158909 158910 158911 158912 158913 158914 158915 158916 158917 158918 158919 158920 158921 158922 158923 158924 158925 158926 158927 158928 158929 158930 158931 158932 158933 158934 158935 158936 158937 158938 158939 158940 158941 
158942 158943 158944 158945 158946 158947 158948 158949 158950 158951 158952 158953 158954 158955 158956 158957 158958 158959 158960 158961 158962 158963 158964 158965 158966 158967 158968 158969 158970 158971 158972 158973 158974 158975 158976 158977 158978 158979 158980 158981 158982 158983 158984 158985 158986 158987 158988 158989 158990 158991 158992 158993 158994 158995 158996 158997 158998 158999 159000 159001 159002 159003 159004 159005 159006 159007 159008 159009 159010 159011 159012 159013 159014 159015 159016 159017 159018 159019 159020 159021 159022 159023 159024 159025 159026 159027 159028 159029 159030 159031 159032 159033 159034 159035 159036 159037 159038 159039 159040 159041 159042 159043 159044 159045 159046 159047 159048 159049 159050 159051 159052 159053 159054 159055 159056 159057 159058 159059 159060 159061 159062 159063 159064 159065 159066 159067 159068 159069 159070 159071 159072 159073 159074 159075 159076 159077 159078 159079 159080 159081 159082 159083 159084 159085 159086 159087 159088 159089 159090 159091 159092 159093 159094 159095 159096 159097 159098 159099 159100 159101 159102 159103 159104 159105 159106 159107 159108 159109 159110 159111 159112 159113 159114 159115 159116 159117 159118 159119 159120 159121 159122 159123 159124 159125 159126 159127 159128 159129 159130 159131 159132 159133 159134 159135 159136 159137 159138 159139 159140 159141 159142 159143 159144 159145 159146 159147 159148 159149 159150 159151 159152 159153 159154 159155 159156 159157 ..
One more error
There are 45967 messages in 2013.
No message(s) currently marked as deleted in 2013
45967 others in 2013
Traceback (most recent call last):
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete
    typ, data = self._get_tagged_response(tag, expect_bye=logout)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response
    self._get_response()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1050, in _get_response
    resp = self._get_line()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1160, in _get_line
    raise self.abort('socket error: EOF')
imaplib.abort: socket error: EOF
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 489, in process
    for mnum, hinfo in get_msg_headers(server, msgnums[i: i + chunkSize]):
  File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 352, in get_msg_headers
    ms = check_response(server.fetch(message_ids_str, "(RFC822.HEADER)"))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 539, in fetch
    typ, dat = self._simple_command(name, message_set, message_parts)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1205, in _simple_command
    return self._command_complete(name, self._command(name, *args))
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1024, in _command_complete
    raise self.abort('command: %s => %s' % (name, val))
imaplib.abort: command: FETCH => socket error: EOF
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1022, in _command_complete
    typ, data = self._get_tagged_response(tag, expect_bye=logout)
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1148, in _get_tagged_response
    self._get_response()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1050, in _get_response
    resp = self._get_line()
  File "/var/packages/py3k/target/usr/local/lib/python3.8/imaplib.py", line 1160, in _get_line
    raise self.abort('socket error: EOF')
imaplib.abort: socket error: EOF
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/volume1/IMAPdedup-master/imapdedup-20210504.py", line 597, in
Hi Tom -
Ah, yes, this (first error) cropped up 6 years ago (#19). The standard Python imaplib library that I use has a limit on the length of the response it expects back from the server. In those days, it was 10KB; I see they've now increased it to 1MB, but that's clearly not enough for some of us!
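As a rough illustration of why a folder this size overflows the limit, the sketch below estimates the length of a single untagged SEARCH response line. It assumes every message in the folder matches and that the server returns sequence numbers 1..N on one line; `estimate_search_response_bytes` is a hypothetical helper, not part of IMAPdedup or imaplib.

```python
def estimate_search_response_bytes(message_count: int) -> int:
    """Estimate the size of one '* SEARCH n1 n2 ...' response line.

    Assumption: all messages match, numbered 1..message_count, and the
    server puts them on a single line terminated by CRLF.
    """
    prefix = len(b"* SEARCH")
    # Each message number is preceded by one space.
    numbers = sum(len(str(n)) + 1 for n in range(1, message_count + 1))
    crlf = 2
    return prefix + numbers + crlf


if __name__ == "__main__":
    size = estimate_search_response_bytes(246410)
    print(f"Estimated SEARCH response: {size} bytes")
    print("Exceeds the 1,000,000 byte limit:", size > 1_000_000)
```

Under those assumptions the estimate for 246410 messages lands well above 1,000,000 bytes, which is consistent with the "got more than 1000000 bytes" error in the first traceback.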
You could try increasing the limit by adding a line somewhere after import imaplib
which says, say,
imaplib._MAXLINE = 5000000
and see if that helps.
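A minimal sketch of that change, assuming it is placed near the top of the script after the import. `_MAXLINE` is a private module-level value in imaplib, so this relies on an implementation detail; `raise_maxline` is a hypothetical helper name that only ever raises the limit and never lowers it.

```python
import imaplib

# 5000000 is the value suggested above; a larger one may be needed
# for very big folders.
DESIRED_MAXLINE = 5_000_000


def raise_maxline(limit: int) -> int:
    """Raise imaplib's response-line limit if it is currently lower."""
    # _MAXLINE is private to imaplib, so this is a workaround rather
    # than a supported setting.
    if imaplib._MAXLINE < limit:
        imaplib._MAXLINE = limit
    return imaplib._MAXLINE


raise_maxline(DESIRED_MAXLINE)
```

The limit has to be set before the SEARCH command is issued, because imaplib reads the module-level value each time it reads a response line.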
The second one may simply be your email server not responding very fast, and there's probably less we can do about that.
One thing you could try (if this is only an occasional operation for you) is to split your big folder up into several smaller ones, and then run
./imapdedup.py ... folder1 folder2 folder3 folder4 ...
and then recombine them, if wanted, once the duplicates have been removed?
Unfortunately, the problem persists after adding imaplib._MAXLINE = 5000000.
The script processes some folders with >200k messages fine and then fails on a folder with 45k messages. The Synology is pretty fast, so I doubt it's the email server's response time. The error occurs in the same folder as last time.
There are 45967 messages in 2013.
No message(s) currently marked as deleted in 2013
45967 others in 2013
Traceback (most recent call last): ......
Splitting into multiple folders isn't an option, unfortunately.
Mmm. Strange.
Not much I can do there, I'm afraid, if imaplib can't cope with it; your server must be sending surprisingly long lines!
You can try increasing _MAXLINE further. Otherwise, this might be a question for the imaplib maintainers.
The only real way to get IMAPdedup to look at a subset of the messages is by using the --sentbefore option to restrict them by date. If you have lots of duplicates, you may be able to eliminate them in chunks this way...?
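To illustrate what a date restriction looks like at the IMAP level, here is a small sketch that builds a search criteria string with the standard SENTBEFORE key. The helper names `imap_date` and `build_query` are hypothetical, and this is not necessarily how IMAPdedup builds its own query internally.

```python
from datetime import date
from typing import Optional

# IMAP dates use English month abbreviations regardless of locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d: date) -> str:
    """Format a date as dd-Mon-yyyy, the form IMAP SEARCH expects."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year:04d}"


def build_query(base: str, sent_before: Optional[date] = None) -> str:
    """Append a SENTBEFORE restriction to a base search key, if given."""
    if sent_before is None:
        return base
    return f"{base} SENTBEFORE {imap_date(sent_before)}"


if __name__ == "__main__":
    # Successive cutoffs let a large folder be handled in stages.
    for year in (2012, 2013, 2014):
        print(build_query("UNDELETED", date(year, 1, 1)))
```

Each earlier cutoff matches fewer messages, so the server's SEARCH response is shorter and less likely to hit the line-length limit.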
Update: Setting imaplib._MAXLINE = 5000000 seems to have solved the problem for all folders but one. I suppose the fault is in that folder, then, and not in the script. Thank you very much! I suggest adding imaplib._MAXLINE = 5000000 as a default.
I had the same issue, but using the below worked.
imaplib._MAXLINE = 10000000
5,000,000 wasn't enough but 10,000,000 was.
Thanks!