imapclient icon indicating copy to clipboard operation
imapclient copied to clipboard

`do_list` crashes when encountering a mailbox containing a `[` (but no closing bracket)

Open jap opened this issue 1 year ago • 1 comments

There's this situation where the parser in imapclient crashes on the response of a LIST command (from dovecot) when there is a mailbox with a name contains an (unclosed) opening square bracket.

I stared at the code but could not find an easy fix :(

This test case (added to TestParseResponse) shows what I was hoping to achieve:

     def test_unclosed_opening_square_bracket(self):
        self._test(b'(foo) "." bar[baz', ((b"foo",), b".", b"bar[baz"), wrap=False)

This test case was inspired by debug output of our own webmail application tests, where such a folder gets created, and then the following untagged responses are returned from dovecot:

DEBUG    imapclient.imaplib:imaplib.py:1241 untagged_responses[LIST] => [b'(\\HasNoChildren) "." 0255021', b'(\\HasNoChildren \\UnMarked) "." 1', b'(\\HasChildren) "." b58562b335b4646084e1', b'(\\HasNoChildren \\UnMarked) "." b58562b335b4646084e1.[b49c20cfb7d8304af5e4' [...]]

This test fails with the following traceback:

Traceback (most recent call last):
  File "/Users/spaans/xsrc/imapclient/imapclient/response_parser.py", line 94, in gen_parsed_response
    for token in src:
  File "/Users/spaans/xsrc/imapclient/imapclient/response_lexer.py", line 122, in __iter__
    for tok in self.read_token_stream(iter(source)):
  File "/Users/spaans/xsrc/imapclient/imapclient/response_lexer.py", line 99, in read_token_stream
    token.extend(read_until(stream_i, CLOSE_SQUARE, escape=False))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spaans/xsrc/imapclient/imapclient/response_lexer.py", line 74, in read_until
    raise ValueError("No closing '%s'" % chr(end_char))
ValueError: No closing ']'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/spaans/xsrc/imapclient/tests/test_response_parser.py", line 174, in test_unclosed_opening_square_bracket
    self._test(b'(foo) "." bar[baz', ((b"foo",), b".", b"bar[baz"), wrap=False)
  File "/Users/spaans/xsrc/imapclient/tests/test_response_parser.py", line 193, in _test
    output = parse_response(to_parse)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spaans/xsrc/imapclient/imapclient/response_parser.py", line 36, in parse_response
    return tuple(gen_parsed_response(data))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/spaans/xsrc/imapclient/imapclient/response_parser.py", line 100, in gen_parsed_response
    raise ProtocolError("%s: %r" % (str(err), token))
imapclient.exceptions.ProtocolError: No closing ']': b'"."'

(I cross referenced this against the grammar in rfc3501 and bar[baz is a valid astring, which in turn is allowed in the mailbox part of LIST responses)

I'm now contemplating implementing a custom parser for LIST responses; would that be an acceptable solution?

jap avatar Feb 26 '24 14:02 jap

Hmm tricky. Right now the lexer is looking for the matching ] which it probably shouldn't be. That should be the parser's responsibility. The lexer at most return tokens for [ and ] but the parser should handle matching them.

I'm not adverse to a separate parser for LIST responses. We already have to do some special post processing in _proc_folder_list for numbers which shouldn't really be there.

If you are able to have a go at this that would be great.

I may also have some time next week to look into this more deeply.

mjs avatar Mar 10 '24 18:03 mjs