Implementation of IMAP IDLE in imaplib?

Open 271e9923-fd98-4048-bbcb-a950400e630c opened this issue 14 years ago • 28 comments

BPO 11245
Nosy @warsaw, @ericvsmith, @bitdancer, @vadmium, @mitya57, @soltysh, @ankostis, @jdek
Dependencies
  • bpo-18921: In imaplib, cached capabilities may be out of date after login
Files
  • imapidle.patch
  • imapidle.patch: New patch
  • 0001-Add-IDLE-support-to-imaplib.patch: IMAP IDLE patch for imaplib
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2011-02-18.20:36:10.487>
    labels = ['type-feature', 'library', 'expert-email', '3.9']
    title = 'Implementation of IMAP IDLE in imaplib?'
    updated_at = <Date 2019-12-19.12:58:09.754>
    user = 'https://bugs.python.org/ShayRojansky'
    

    bugs.python.org fields:

    activity = <Date 2019-12-19.12:58:09.754>
    actor = 'jdek'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)', 'email']
    creation = <Date 2011-02-18.20:36:10.487>
    creator = 'Shay.Rojansky'
    dependencies = ['18921']
    files = ['27400', '37555', '48790']
    hgrepos = []
    issue_num = 11245
    keywords = ['patch']
    message_count = 24.0
    messages = ['128814', '129405', '129407', '171885', '171889', '172365', '172383', '202149', '202342', '202345', '233176', '235972', '236167', '245204', '245246', '245249', '245252', '245253', '245255', '245284', '246235', '246236', '293222', '358678']
    nosy_count = 17.0
    nosy_names = ['barry', 'pierslauder', 'eric.smith', 'piers', 'r.david.murray', 'Shay.Rojansky', 'martin.panter', 'mitya57', 'maciej.szulik', 'nafur', 'dveeden', 'Malina', 'F.Malina', 'ankostis', 'equaeghe', 'ohreally', 'jdek']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'test needed'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue11245'
    versions = ['Python 3.9']
    

    IMAP IDLE support is not implemented in the current imaplib. A "drop-in" replacement called imaplib2 exists (), but uses internally managed threads - a heavy solution that is not always appropriate (e.g. when handling many IMAP accounts an asynchronous approach would be more efficient)

    I am about to start implementation of an asynchronous select()-compatible approach, and was wondering if there has been any discussion over IDLE, any specific reasons it hasn't been implemented and if eventual integration into imaplib would be a desirable thing.

    Proposed approach:

    • Addition of a new state 'IDLE'
    • Addition of an idle() method to class IMAP4, which issues the IDLE command to the server and returns immediately. At this point we enter the IDLE state, in which no normal IMAP commands may be issued.
    • Users can now select() or poll() the socket as they wish
    • A method can be called to retrieve any untagged responses (e.g. EXISTS) that have arrived since entering the IDLE state. The function returns immediately and does not modify the state.
    • To end the IDLE state, the user calls a method (done()?) which resumes the previous state.
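To make the proposed flow concrete, here is a minimal sketch of the state handling only. The class name `IdleSketch` and its `responses()` method are hypothetical (nothing like this exists in imaplib), the command tag is hard-coded, and the server's `+` continuation line is treated like any other line.

```python
import select
import socket


class IdleSketch:
    """Hypothetical client-side state handling for IDLE (not an imaplib API)."""

    def __init__(self, sock, state="SELECTED"):
        self.sock = sock
        self.state = state
        self._previous_state = None
        self._buffer = b""

    def fileno(self):
        # Lets callers pass this object straight to select() or poll().
        return self.sock.fileno()

    def idle(self):
        """Send IDLE and return immediately, entering the IDLE state."""
        if self.state == "IDLE":
            raise RuntimeError("already idling")
        self.sock.sendall(b"A001 IDLE\r\n")
        self._previous_state, self.state = self.state, "IDLE"

    def responses(self):
        """Return the complete lines that have arrived, without blocking."""
        while select.select([self.sock], [], [], 0)[0]:
            chunk = self.sock.recv(4096)
            if not chunk:
                break  # connection closed by the peer
            self._buffer += chunk
        # Keep any trailing partial line in the buffer for the next call.
        *lines, self._buffer = self._buffer.split(b"\r\n")
        return lines

    def done(self):
        """Send DONE and resume the state held before idle()."""
        if self.state != "IDLE":
            raise RuntimeError("not idling")
        self.sock.sendall(b"DONE\r\n")
        self.state = self._previous_state
```

Because the object exposes `fileno()`, many such connections can be handed to a single `select()` call, which is the multi-account case the proposal has in mind.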

    Would appreciate any sort of feedback...

    imaplib has no particular maintainer and I know little about it. The doc says it implements "a large subset of the IMAP4rev1 client protocol as defined in RFC 2060." I do not remember any discussion on pydev, over the last several years, about imaplib. I presume just the subset was chosen because of some combination of necessity and feasibility, as judged by the implementors. Hence the complement, the unimplemented subset, would be 'not done' rather than 'not wanted'. If your proposed new feature, an IDLE command, is part of this complement, then I would assume that a patch would, in principle, be acceptable.

    I cannot comment on your particular proposal, but I hope the above helps as far as it goes.

    terryjreedy avatar Feb 25 '11 19:02 terryjreedy

    I just wound up doing a bit of research on this for other reasons. Piers Lauder was the original author of the imaplib module, and he is (as far as I can tell) currently maintaining an imaplib2 module that does support IDLE (but not, I think, python3). But it does IDLE (and other things) via threads, and in the email I found announcing it he didn't think it was suitable for stdlib inclusion (because of the threading). Piers hasn't contributed to core in quite a while as far as I can tell, but he was active in a bug report back in 2008 according to google, so I thought I'd add him to nosy and see if he has time for an opinion.

    bitdancer avatar Feb 25 '11 19:02 bitdancer

    We have implemented this functionality according to RFC 2177. We actually implemented a synchronous idle function that blocks until a timeout occurs or the server sends an event.

    This is not the most flexible way, but it provides basic functionality that enables users to use IMAP IDLE based notifications. Besides, every other solution would require threads or regular polling.

    See attached patch file.

    To fully answer the original question that opened this issue: contributions will be welcomed. While I don't currently have time to work on imaplib myself, I have an interest and will review and if appropriate commit patches.

    I like Shay's proposal, but absent a patch along those lines having blocking IMAP support will definitely be an improvement. An application needing to monitor more than one imap connection could do its own threading.

    Thanks for proposing the patch. Could you please submit a contributor agreement? I probably won't have time to fully consider the proposed patch for a bit, but I've put it on my todo list.

    test_imaplib does have a testing framework now, do you think you could write tests for the new feature?

    bitdancer avatar Oct 03 '12 14:10 bitdancer

    I got the confirmation for my agreement.

    I'm not quite sure about the tests, as I'm not really familiar with the way this is done in cpython. The test_imaplib.py seems to cover all ways to connect to some server, but none of the actual imap commands. The patch only implements another command (whose behaviour depends almost entirely on other events on the server).

    Hence, I don't see a way to create a meaningful test case other than just calling the command...

    Yeah writing a good test case for this is a bit tricky, since we'll need some infrastructure (an Event?) so we can prove that the call has blocked without delaying the test for very long.
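One way the Event idea could look, as a rough sketch rather than actual test_imaplib code: `blocking_idle` is a hypothetical stand-in for the patch's blocking call, and a `socket.socketpair()` plays the server. The only real delay is the short wait used to observe that the call has not returned yet.

```python
import select
import socket
import threading


def blocking_idle(sock, timeout):
    """Hypothetical stand-in for the patch: wait for one server push."""
    sock.sendall(b"A001 IDLE\r\n")
    ready, _, _ = select.select([sock], [], [], timeout)
    return sock.recv(4096) if ready else None


def check_idle_blocks_until_event():
    client, server = socket.socketpair()
    returned = threading.Event()
    result = []

    def worker():
        result.append(blocking_idle(client, timeout=5))
        returned.set()

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        # The fake server sees the command, so the client is now waiting.
        assert server.recv(4096) == b"A001 IDLE\r\n"
        # A short wait is enough to show that the call has not returned.
        blocked = not returned.wait(0.05)
        # Any untagged response should wake the client promptly.
        server.sendall(b"* 1 EXISTS\r\n")
        woke = returned.wait(2)
    finally:
        thread.join(5)
        client.close()
        server.close()
    return blocked, woke, result
```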

    I'm not sure when I'll be able to get to this...I'm not going to check it in without a test, so I'll have to find enough time to be able to write the tests.

    bitdancer avatar Oct 08 '12 15:10 bitdancer

    I stumbled upon this issue again and would really like to see it fixed.

    I see the possibility of creating a test case in combination with the first test sequence, which creates a temporary mail. Would it be enough to just call IDLE in some folder, create a temporary mail in this folder, and check that it returns?

    Unfortunately, I have not been able to write code for such a test case yet, as the whole test routine fails with "[PRIVACYREQUIRED] Plaintext authentication disallowed on non-secure (SSL/TLS) connections". This is using 3.2.3, but I guess it will not be any different with the current release... (as it is the same with 2.7.3)

    What do you mean by the whole test routine failing? The test suite is currently passing on the buildbots, so are you speaking of the new test you are trying to write?

    bitdancer avatar Nov 07 '13 15:11 bitdancer

    Hmm. Looking at this again, it appears as though there's no way to interrupt IDLE if you want to, say, send an email. If you are actually using this in code, how are you handling that situation?

    bitdancer avatar Nov 07 '13 15:11 bitdancer

    So, let's resurrect this one.

    For the project that led to the old patch, we did not need this feature. However, we now need a more complete implementation of IDLE. Hence, we extended the patch so that idle() returns right after sending the command, and added support for polling, for leaving idle mode, and for waiting until something happens (as before).

    IMAP polling hurts, just merge imaplib2 into standard library as imaplib.

    Piers Lauder authored the imaplib IMAP4 client, part of the Python standard library, back in December 1997 based on RFC 2060. In 2003 RFC 2060 was made obsolete by RFC 3501, which added important features, and Piers released imaplib2, which has received feature updates since. The last feature updates to the standard library imaplib were made before Piers retired from Sydney University a decade ago.

    imaplib2 presents an almost identical API to that provided by the standard library imaplib, the main difference being that imaplib2 allows parallel execution of commands on the IMAP4 server, and implements the IDLE extension, so NO POLLING IS REQUIRED. The IMAP server will push new mail notifications to the client. Imaplib2 also supports COMPRESS, ID, better timeout handling etc. There are 975 more lines of code, all doing useful things a modern IMAP client needs.

    imaplib2 can be substituted for imaplib in existing clients with no changes in the code apart from the required logout call to shut down the threads.

    The old imaplib was ported to Python 3 with the rest of the standard library. I am working to port imaplib2 to py3, and am stuck on receiving bytes vs. strings.

    References:

    imaplib2 code and docs http://sourceforge.net/p/imaplib2/code/ci/master/tree/ also http://sydney.edu.au/engineering/it/~piers/python/imaplib2.html

    imaplib https://hg.python.org/cpython/file/3.4/Lib/imaplib.py

    Ruby stdlib support for idle (not that it hurts python performance, just my pride) http://ruby-doc.org/stdlib-2.0.0/libdoc/net/imap/rdoc/Net/IMAP.html#method-i-idle

    Imaplib2 now supports Python 3. Piers and I propose to merge imaplib2 into the standard library as imaplib.

    Excerpt from our conversation:

    Piers: ...Thanks for bringing it (this thread) to my attention. I entirely agree with your comments.

    Me: ...I found the criticism of "threads - a heavy solution" counterproductive. Not that I know anything about threads...

    Piers: I'm not sure what the whole anti-threads thing was about all those years ago since I always loved using them. Maybe early implementations were slow, or, more likely, early adopters were clumsy ("giving threads to a novice is like giving a blow torch to a baby" to paraphrase a quote :-)

    Are you volunteering to be maintainer, and/or is Piers? If he's changed his mind about the threading, that's good enough for me (and by now he has a lot more experience with the library in actual use).

    The biggest barrier to inclusion, IMO, is tests and backward compatibility. There have been enough changes that making sure we don't break backward compatibility will be important, and almost certainly requires more tests than we have now. Does imaplib2 have a test suite?

    We would need to get approval from python-dev, though. We have ongoing problems with packages that are maintained outside the stdlib...but updating to imaplib2 may be better than leaving it without a maintainer at all.

    Can we get Piers involved in this conversation directly?

    bitdancer avatar Jun 12 '15 02:06 bitdancer

    I am in for my part and I emailed Piers to come and join us and he surely will when the bug tracker is responsive again.

    Imaplib2 does have an up to date test suite and compatibility wise imaplib2 can be substituted for imaplib in existing clients with no changes in the code.

    On top of that I have a private IDLE test suite for common tasks such as

    • instant bounce processing,
    • email photo upload to web service,
    • unsubscribe processing via list-unsubscribe headers and feedback loops of major email providers.

    I am looking to make it part of an external project https://github.com/fmalina/emails, but need to extract much of the recipes first out of a working application in a reusable manner as I need it in other projects and will do for years to come.

    This is great.

    When you say it is fully compatible, though, is that testing against imaplib in python2 or python3? It is the python3 decisions about string/bytes handling where the discrepancies are most likely to arise, unless the python3 port was modeled on the stdlib version.

    bitdancer avatar Jun 12 '15 13:06 bitdancer

    I just went through my repo looking at relevant commits to double check and I didn’t have to change a line in my user level code when upgrading from python2 to 3. There was only one way to do it.

    Do you have any tests that use non-ascii passwords? I think that was the most significant bug.

    bitdancer avatar Jun 12 '15 13:06 bitdancer

    I don’t have a test for it, neither has stdlib imaplib. We just need to port over the encode fix.

    Copy over the fixed version of _CRAM_MD5_AUTH from line 599 in the python3.5 imaplib (https://github.com/python/cpython/blob/master/Lib/imaplib.py#L599), corresponding to line 884 in imaplib2 (https://github.com/bcoe/imaplib2/blob/master/imaplib2/imaplib2.py#L884).

    Hi, apologies for not responding to the "pierslauder" pings, but I don't own that login, or at least have forgotten all about it, and its email address is invalid (or there is another pierslauder out there).

    I maintain imaplib2 on sourceforge (as piersrlauder) at https://sourceforge.net/projects/imaplib2/ and that version has just been modified to incorporate the CRAM_MD5_AUTH change from python3.6. It is regularly updated with bug fixes and it also has built-in tests for the IDLE function.

    I originally intended for imaplib2 to be incorporated into pythonlib, leaving the original module in place (a la urllib/2). Then people wouldn't be forced into a switch using threads except by choice.

    Anyway, happy to help.

    Thanks, Piers!

    Sorry for dropping off the map on this, I've been busy.

    I'll post to python-dev about this and see how the community would like to proceed.

    bitdancer avatar Jul 04 '15 00:07 bitdancer

    By the way, the pierslauder id points to '[email protected]'.

    bitdancer avatar Jul 04 '15 01:07 bitdancer

    Before merging imaplib2 please consider making proper use of Python's standard logging module.

    Hi, I'm new to python but I had a go at implementing this for imaplib(1) using a different approach. It works but it has a couple issues (see patch), I would appreciate any thoughts/improvements.

    FYI,

    Here is a bare-minimum version that Works For Me (so far) with python 3.9 and dovecot 1.2.9 (don't ask). This is meant to replace a crappy cron job * * * * * nobody curl https://⋯/receiveEmail.php >/dev/null. I'm pretty sure this will eventually OOM, as I never explicitly reap the "still here" continuation responses.

    #!/usr/bin/python3 
    
    import imaplib
    import logging

    import requests  # third-party; needed for requests.get() below
    
    
    # Enable IDLE support. 
    if 'IDLE' in imaplib.Commands:
    
        # in the unlikely event this feature is fixed upstream... 
        IMAP4_SSL = imaplib.IMAP4_SSL
    
    else:
    
        class IMAP4_SSL_plus_IDLE(imaplib.IMAP4_SSL):
    
            def idle(self):
                if 'IDLE' not in self.capabilities:
                    raise self.error('Server does not support IDLE')
                idle_tag = self._command('IDLE')  # start idling 
                self._get_response()
                while line := self._get_line():
                    if line.endswith(b'EXISTS'):
                        self.send(b'DONE' + imaplib.CRLF)
                        return self._command_complete('IDLE', idle_tag)
    
        imaplib.Commands['IDLE'] = ('AUTH', 'SELECTED')
        IMAP4_SSL = IMAP4_SSL_plus_IDLE
    
    
    with IMAP4_SSL(⋯) as conn:
        conn.login(user=⋯, password=⋯)
        conn.select(mailbox='INBOX', readonly=True)
        conn.debug = 100            # DEBUGGING 
        while True:
            resp = requests.get('https://⋯/receiveEmail.php')
            resp.raise_for_status()
            logging.debug('HTTP GET said %s', resp.text)
            resp = conn.idle()
            logging.debug('IMAP IDLE said %s', resp)
    

    trentbuck avatar Nov 07 '22 07:11 trentbuck

    Looking at this again, it appears as though there's no way to interrupt IDLE if you want to, say, send an email.

    @bitdancer I wonder if this is a realistic use case. Does imaplib support sending? Does the protocol? The RFCs say no:

    rfc3501: "IMAP4rev1 does not specify a means of posting mail; this function is handled by a mail transfer protocol such as RFC 2821."

    rfc9051: "IMAP4rev2 does not specify a means of posting mail; this function is handled by a mail submission protocol such as the one specified in RFC 6409."

    Given that limitation inherent to the protocol, I would think it sufficient for an idle() method to simply block, until one of these things occurs:

    1. A callback for receiving untagged data says "that's enough; we're done idling."
    2. A caller-supplied idle timeout expires.

    I think repeatedly long-polling with a timeout (case 2) is a pretty common pattern in systems programming and communication web services, so probably familiar to a lot of programmers, and easy enough for calling code to implement. Clients would likely have to do this anyway, in order to avoid being disconnected by a server with an autologout timer. (The RFCs suggest a 29 minute IDLE for this reason.)
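A rough sketch of that blocking shape, to show how small it can be. This is not the implementation described in this thread: `idle_until` is a hypothetical helper that talks to a bare socket, hard-codes the command tag, and skips the tagged completion that follows DONE.

```python
import select
import socket
import time


def idle_until(sock, on_untagged, timeout):
    """Block in IDLE until the callback returns true or the timeout expires.

    Hypothetical helper, not an imaplib API. Returns the untagged lines
    seen and whether the callback (rather than the timeout) ended the idle.
    """
    deadline = time.monotonic() + timeout
    sock.sendall(b"A001 IDLE\r\n")  # the tag is hard-coded for brevity
    buffer = b""
    seen = []
    stopped = False
    while not stopped:
        remaining = deadline - time.monotonic()
        if remaining <= 0 or not select.select([sock], [], [], remaining)[0]:
            break  # case 2: the caller-supplied timeout expired
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("server closed the connection during IDLE")
        buffer += chunk
        *lines, buffer = buffer.split(b"\r\n")
        for line in lines:
            if line.startswith(b"* "):
                seen.append(line)  # keep every line, even after a stop
                if on_untagged(line):
                    stopped = True  # case 1: "that's enough"
    sock.sendall(b"DONE\r\n")
    return seen, stopped
```

Calling code would wrap this in a loop with `timeout=29 * 60`, re-issuing IDLE each time the timeout ends the call.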

    One nice thing about this approach is that it gets the job done without departing from the library's existing single-threaded design. This keeps the API semantics consistent, and avoids introducing the complexities of multi-threading into programs that use it.

    I spent today working on an implementation. At roughly 50 lines of new code plus comments and doc strings, it's already functional. It adheres to the spec in areas where other attempts I've seen do not. (Notably, it avoids inventing new states or commands, avoids assumptions about what IDLE events the server will push, and avoids losing previously collected data.) It uses imaplib's existing machinery to do all the parsing and I/O, so new tests for those things would presumably not be needed.

    Other than logging, the main thing left to add is the timeout. I plan to use select() for the common cases: socket-based connections on all platforms, and stdin/stdout pipes on unix. For the special case of stdin/stdout on Windows, I plan to let the timeout be delayed until the next untagged response arrives (because select() doesn't work on Windows pipes) or perhaps just let the timeout be disabled in that case. Either way, it can be documented with an OS availability note, much like various other parts of the Python stdlib.

    foresto avatar Jul 01 '24 08:07 foresto

    Digging deeper into this reveals that supporting IDLE with a naïve timeout implementation is doomed to work poorly, because imaplib creates its file-like objects in buffered mode. That's a problem for any timeout based on select() or poll(), because those system calls can't see already-buffered data, leading them to block even when data is ready for reading.
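The effect is easy to reproduce outside imaplib. In this small demonstration a `socket.socketpair()` stands in for the server connection, and the buffered reader comes from `makefile("rb")`, similar to how imaplib sets up its file objects. It assumes both lines arrive in a single read, which is the usual case when they are sent together.

```python
import select
import socket

server, client = socket.socketpair()
reader = client.makefile("rb")  # buffered, like imaplib's file objects

# The server pushes two responses in one burst.
server.sendall(b"* 1 EXISTS\r\n* 2 EXISTS\r\n")

# Reading the first line pulls both lines into the reader's buffer.
first = reader.readline()

# Nothing is left on the socket itself, so select() sees nothing to read...
readable, _, _ = select.select([client], [], [], 0)

# ...even though the second response can be read without blocking.
second = reader.readline()

print(first, readable, second)
```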

    I see two possible ways to deal with this, neither of which is trivial:


    1. Stop using the standard library's buffered file objects, override read() and readline() with custom implementations that expose their internal buffer, and check that buffer before calling select() for an idle timeout. The file object setup code would be slightly different for the socket-based IMAP4 classes vs. IMAP4_stream, but the key methods could be shared.

    The risk I see with approach 1 is that, since imaplib named its file objects with no leading underscore, there might be client code in the wild that uses them directly and depends on their original (buffered) behavior. If such code exists, it would likely break.

    Breakage could be avoided by subclassing IMAP4, IMAP4_SSL, and IMAP4_stream, and implementing IDLE only in the new subclasses.

    Alternatively, a custom file-like class implementing all 8 read and write methods from io.BufferedReader, and exposing its internal buffer as needed for select(), could be written and used instead of the one from the standard library. This should work even with client code that uses imaplib's file objects directly, so long as it doesn't check the types of those objects with isinstance() et al.

    (Then again, perhaps it's okay to break attributes like IMAP4.file and IMAP4_stream.readfile, given that they do not appear in the documentation?)
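As an illustration of the idea behind approach 1 (not the code I actually wrote), a reader that owns its buffer can consult it before falling back to select(). `PeekableLineReader` and its method names are hypothetical, and only `readline()` is covered here rather than the full set of file methods.

```python
import select
import socket


class PeekableLineReader:
    """Hypothetical line reader whose buffer is visible to the caller."""

    def __init__(self, sock):
        self.sock = sock
        self._buffer = bytearray()

    def has_buffered_line(self):
        return b"\n" in self._buffer

    def wait_readable(self, timeout):
        """Like select(), but check our own buffer first."""
        if self.has_buffered_line():
            return True
        return bool(select.select([self.sock], [], [], timeout)[0])

    def readline(self):
        """Return one line, blocking on the socket only if needed."""
        while not self.has_buffered_line():
            chunk = self.sock.recv(4096)
            if not chunk:
                break  # EOF: return whatever is left
            self._buffer += chunk
        # End of the first line, or the whole buffer if no newline at EOF.
        end = self._buffer.find(b"\n") + 1 or len(self._buffer)
        line = bytes(self._buffer[:end])
        del self._buffer[:end]
        return line
```

With this in place, an idle timeout would call `wait_readable(remaining)` instead of calling select() directly, so already-buffered responses are never missed.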


    2. Put the file object in nonblocking mode and/or set a socket timeout during IDLE. Handle the resulting exceptions from IMAP4._get_line(), try to always completely drain the buffer on every read, and use select() for timeouts despite having no visibility into the buffer.

    Note that having to drain the buffer on every read would preclude a "get one response during IDLE" method; users would have to be content with receiving multiple responses per method call, unless another buffer was added to dole them out one at a time.

    The risk I see with approach 2 is that the file object comes from socket.makefile(), which forbids nonblocking mode and strongly discourages socket timeouts, according to its documentation:

    "The socket must be in blocking mode; it can have a timeout, but the file object’s internal buffer may end up in an inconsistent state if a timeout occurs."

    That suggests to me that, although it might work in current CPython versions on popular platforms, it might eventually fail on other python implementations, versions, or operating systems. And since the internal buffer is affected, the failure could mean silent data loss or corruption.


    Approach 2 is what imapclient does.

    Approach 1 (using subclasses) is what I did in my own code.

    foresto avatar Jul 15 '24 00:07 foresto

    My first implementation used callbacks to handle idle responses, but it felt a little awkward, so I switched to an iterable context manager instead. Using it goes something like this:

    
    with imap.idle(duration=29*60) as idler:
        for response in idler:
            typ, data = response
            print(typ, data)
    

    The IDLE command is sent upon entering the context, and DONE is sent at exit. Untagged responses arriving with the server's continuation request are queued for delivery by the iterator.

    The optional duration argument limits the idle duration, for example to avoid a server-imposed inactivity timeout, or to make sure an exception is eventually raised if the network disappears during idle.

    To get a single response from an idle session:

    
    with imap.idle() as idler:
        typ, data = next(idler)
    
    

    Any received leftovers are appended to untagged_responses on context exit, so they can be collected in the usual imaplib way.

    The iterator has a (generator) method to get the next burst of responses, such as a rapid-fire series of EXPUNGE after a bulk delete:

    
    with imap.idle() as idler:
    
        # get the next response and any others following by < 0.1 second
        batch = list(idler.burst(interval=0.1))
    
        print(f'processing {len(batch)} responses...')
        for typ, data in batch:
            print(typ, data)
    

    The burst method also works within iteration loops, and respects duration:

    
    with imap.idle(duration=29*60) as idler:
    
        for typ, data in idler:
    
            print('got a single response:', typ, data)
    
            batch = list(idler.burst())
            print(f'also got a burst of {len(batch)} more responses')
    

    To spend 29 minutes processing response bursts:

    
    with imap.idle(duration=29*60) as idler:
    
        while batch := list(idler.burst()):
    
            print(f'processing {len(batch)} responses...')
            for typ, data in batch:
                print(typ, data)
    

    ...Or just use it the simple way:

    
    with imap.idle() as idler:
        for response in idler:
            print(response)
    

    foresto avatar Jul 15 '24 00:07 foresto

    Discussion started here:

    https://discuss.python.org/t/gauging-interest-in-my-imap4-idle-implementation-for-imaplib/59272

    foresto avatar Jul 26 '24 21:07 foresto

    I am sorry, but I am not sure the interface proposed in #122542 is the best possible interface. There are some issues with it:

    • Using the iterator protocol means that the polling is blocking. When we are blocked waiting for a response from the server, there is no way to interrupt the wait. This means that an interactive application which displays an IMAP4 folder status at runtime cannot stay responsive to user input. We need a non-blocking operation that returns a response if there is one and returns immediately if there is none. We also need a blocking operation with a timeout and a way to interrupt the thread before the timeout expires.
    • Using the context manager protocol means that DONE is only sent after leaving the with block. After the client has sent DONE, the server can still send status updates. What happens to them? You cannot use Idler after leaving the with block. We need a method to send DONE and read the status updates that came after this.
    • The server can send status updates not only for the IDLE command, but for other commands too. It is common to use the NOOP command for this. There is a large similarity between the NOOP and IDLE commands, but it is not visible in the proposed interface. You can read status updates from untagged_responses after the NOOP command, although this is not thread-safe. It would be nice to have a unified way to get status updates for IDLE and other commands. It would be nice if it were thread-safe, or if there were a clear way to use the IMAP4 client in a thread-safe way.
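To illustrate the first point, one possible shape for a wait that supports polling, a timeout, and interruption from another thread is the self-pipe trick. This is only a sketch under my own assumptions: `InterruptibleWait` is a hypothetical name, not part of imaplib or of #122542, and it ignores the buffering question discussed earlier in this issue.

```python
import select
import socket


class InterruptibleWait:
    """Hypothetical helper: wait on a socket, a timeout, or a wake-up call."""

    def __init__(self, sock):
        self.sock = sock
        # Self-pipe trick: writing to one end wakes a select() on the other.
        self._wake_recv, self._wake_send = socket.socketpair()

    def interrupt(self):
        """May be called from another thread, e.g. a UI event handler."""
        self._wake_send.send(b"\0")

    def wait(self, timeout=None):
        """Return 'data', 'interrupted', or 'timeout'; timeout=0 just polls."""
        ready, _, _ = select.select(
            [self.sock, self._wake_recv], [], [], timeout
        )
        if self._wake_recv in ready:
            self._wake_recv.recv(4096)  # drain pending wake-ups
            return "interrupted"
        return "data" if ready else "timeout"
```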

    Let's look at what other solutions provide. Interestingly, there were two different third-party packages named imaplib2, and both were proposed for inclusion in the stdlib.

    Piers Lauder's imaplib2 was discussed above. It is available on PyPI. It has a simple blocking idle() method which waits for the first server response or the expiration of the timeout. Then it sends DONE and waits for the tagged response to the IDLE command. You can read status updates from untagged_responses after this. It may not be very efficient, because you need to send IDLE again, but it fits perfectly in line with other commands. This package is compatible with imaplib. The main disadvantage of this package is that it runs internal threads. But I think it may be possible to implement it without running an internal thread. Instead, the user would need to run external threads to utilize the IMAP4 concurrency features. imaplib would still need to use locks and events to make its methods thread-safe.

    Maxim Khitrov's imaplib2 was discussed on Python-ideas. It provides an incompatible interface, in particular a concurrent fetch(). Its interface for IDLE is similar to the one proposed in #122542, with support for the context manager and iterator protocols, but it also has the poll() method.

    Looking at support in other programming languages, in Perl, support for IDLE is provided by three methods -- idle, idle_data and done. idle is non-blocking, it simply sends the IDLE command, idle_data is non-blocking or blocking with timeout, it polls server response. done sends DONE.

    serhiy-storchaka avatar Dec 17 '24 10:12 serhiy-storchaka