neomutt icon indicating copy to clipboard operation
neomutt copied to clipboard

Heap overflow in maildir/shared.c:maildir_move_to_mailbox

Open vuori opened this issue 3 years ago • 2 comments

Expected Behaviour

Switching Maildir mailboxes shouldn't crash.

Actual Behaviour

Switching Maildir mailboxes crashes with ASan and sometimes without it (without a crash there may be silent memory corruption), because mutt_mailbox_check has caused Mailbox.msg_count to become invalid while mx_mbox_open was running.

Steps to Reproduce

I haven't managed to deterministically reproduce this, but the required conditions are something like:

  1. mail_check_stats is enabled.
  2. Status line contains an expando that depends on stats check result (such as %b).
  3. A stats check is pending when user switches to a new mailbox (such as by invoking next-unread-mailbox).
  4. The target mailbox is a Maildir which has received a largish number of new mails since it was last opened (if more than 25, the issue should always occur).

How often does this happen?

Once or twice a week.

When did it start to happen?

A long time ago (over a year probably).

NeoMutt Version

Near-latest master.

Extra Info

The underlying issue is that most code expects m->msg_count to reflect the number of mails that Neomutt has read, viz. the number of valid entries in m->emails. In particular, maildir_move_to_mailbox expects m->emails[m->msg_count] to be the first unused slot in m->emails (or the point after the end of the array).

Unfortunately, mailbox stats checking functions like maildir_mbox_check_stats violate this invariant by setting m->msg_count to the number of mails in the file system, which is often different from the number of mails that have been read into m->emails.

This will usually not cause problems, since mutt_mailbox.c:mailbox_check will prevent stats checking from running on the currently-open mailbox, and mx_open_mailbox will always clear the to-be-opened mailbox's msg_count and emails array. However, if stats checking ever happens on an open mailbox, things are bound to break.

What happens here (code has been instrumented with extra assertions to catch this situation):

[assert()-related frames removed]
#4  0x000055665efcaa66 in maildir_mbox_check_stats (m=0x611000018500, flags=0 '\000') at maildir/maildir.c:1357
#5  0x000055665ee10b27 in mx_mbox_check_stats (m=0x611000018500, flags=0 '\000') at mx.c:1809
#6  0x000055665edf7c11 in mailbox_check (m_cur=0x61100000a400, m_check=0x611000018500, st_ctx=0x7ffe4f135b50, flags=0 '\000')
    at mutt_mailbox.c:134
#7  0x000055665edf8542 in mutt_mailbox_check (m_cur=0x61100000a400, flags=0 '\000') at mutt_mailbox.c:215
#8  0x000055665ee253df in status_format_str
    (buf=0x7ffe4f136370 "", buflen=1024, col=65, cols=109, op=98 'b', src=0x7ffe4f136866 "%?l? %l?]---%?V?(Lim:%V)?-%>-(%P)---", prec=0x7ffe4f136190 "", if_str=0x7ffe4f136230 " Inc:%b", else_str=0x7ffe4f1362d0 "", data=140730225095088, flags=4 '\004') at status.c:119
#9  0x000055665eded403 in mutt_expando_format
    (buf=0x7ffe4f137290 "---NeoMutt: ~/Maildir/.lists.emacs-devel [Msgs:4301 New:69", buflen=1023, col=65, cols=109, src=0x7ffe4f136866 "%?l? %l?]---%?V?(Lim:%V)?-%>-(%P)---", callback=0x55665ee25160 <status_format_str>, data=140730225095088, flags=4 '\004')
    at muttlib.c:1221
#10 0x000055665ee2705e in menu_status_line
    (buf=0x7ffe4f137290 "---NeoMutt: ~/Maildir/.lists.emacs-devel [Msgs:4301 New:69", buflen=1024, shared=0x60600003aca0, menu=0x60c000003b80, cols=109, fmt=0x60e0000011c0 "-%r-NeoMutt: %D [Msgs:%?M?%M/?%m%?n? New:%n?%?o? Old:%o?%?d? Del:%d?%?F? Flag:%F?%?t? Tag:%t?%?p? Post:%p?%?b? Inc:%b?%?l? %l?]---%?V?(Lim:%V)?-%>-(%P)---") at status.c:450
#11 0x000055665ee4fdb9 in ibar_recalc (win=0x60e000002880) at index/ibar.c:98
#12 0x000055665ef31a64 in window_recalc (win=0x60e000002880) at gui/mutt_window.c:568
#13 0x000055665ef31af2 in window_recalc (win=0x60e0000026c0) at gui/mutt_window.c:574
#14 0x000055665ef31af2 in window_recalc (win=0x60e000002c00) at gui/mutt_window.c:574
#15 0x000055665ef31af2 in window_recalc (win=0x60e0000025e0) at gui/mutt_window.c:574
#16 0x000055665ef31af2 in window_recalc (win=0x60e000000740) at gui/mutt_window.c:574
#17 0x000055665ef31af2 in window_recalc (win=0x60e000000580) at gui/mutt_window.c:574
#18 0x000055665ef31db2 in window_redraw (win=0x60e000000580) at gui/mutt_window.c:619
#19 0x000055665ef2dfbe in msgcont_push_window (win=0x60e000005c20) at gui/msgcont.c:98
#20 0x000055665eea497e in progress_new
    (msg=0x7ffe4f1379c0 "Scanning /home/vuori/Maildir/.lists...", type=MUTT_PROGRESS_READ, size=0) at progress/progress.c:135
#21 0x000055665efc648d in maildir_read_dir (m=0x61100000b800, subdir=0x55665f19a240 "new") at maildir/maildir.c:684
#22 0x000055665efc8b99 in maildir_mbox_open (m=0x61100000b800) at maildir/maildir.c:1102
#23 0x000055665ee06d72 in mx_mbox_open (m=0x61100000b800, flags=0 '\000') at mx.c:391
#24 0x000055665ee31646 in change_folder_mailbox
    (menu=0x60c000003b80, m=0x61100000b800, oldcount=0x6040000159d4, shared=0x60600003aca0, read_only=false) at index/dlg_index.c:701
#25 0x000055665ee3fd44 in op_main_next_unread_mailbox (shared=0x60600003aca0, priv=0x6040000159d0, op=166) at index/functions.c:1425
#26 0x000055665ee49ed6 in index_function_dispatcher (win=0x60e0000027a0, op=166) at index/functions.c:2860
[…]

So during mx_mbox_open, a status line redraw prompted by the progress viewer is causing a stats check. This may set m->msg_count to a value that is larger than current number of valid entries in m->emails. If execution were to continue from this point, maildir_move_to_mailbox will start inserting mails at some point beyond the number of currently valid entries; depending on the alignment of stars, this may also be beyond the number of entries currently allocated to m->emails, in which case it will overrun the heap.

Proposed fix

The situation is tricky, but there are multiple avenues that come to mind:

  1. The hard but correct fix: The big issue is breaking the invariant on the meaning of msg_count by stats checking. Ideally, "number of emails we have read" (value most of the code cares about) and "number of emails the backend thinks exist" (value the user cares about viz. the display count) would live in separate struct members. This looks like an annoying but mostly mechanical refactoring, and IMO should happen eventually. (I can probably work on this if needed.)
  2. The potentially otherwise useful fix of unknown difficulty: Perhaps progress_new shouldn't prompt status line redraws; this would avoid the check_stats call and probably speed up execution in general. I haven't looked into what this would involve.
  3. The quick fix: stash the mailbox being opened into a global variable while mx_mbox_open is running and prevent mutt_mailbox_check from probing that mailbox. Or maybe stash a boolean like kInMboxOpen and if it's true, skip checking completely. Or something else with similar effect.

vuori avatar Mar 12 '23 13:03 vuori

This will usually not cause problems, since mutt_mailbox.c:mailbox_check will prevent stats checking from running on the currently-open mailbox

Thanks very much for the analysis, and especially for this sentence: I think it is enlightening.

This tells me that it is (or, was) a known fact that checks should not be ran on an open mailbox, exactly because that might impact msg_count. When a mailbox is open, we expect NeoMutt to have a coherent view on it: if NeoMutt knows it has M messages, then it has M messages loaded.

To me, it looks like the correct fix would be to make sure the code knows early enough we're opening mailbox X, so mailbox_check knows it mustn't run stats on it.

gahr avatar Mar 13 '23 08:03 gahr

Yes, you could say that at some level the problem is also that the shared state (including "current mailbox" state) does not exactly follow along with the close-to-open flow. That is, since the shared state update happens at higher levels, it will currently be wrong during half of the low-level close+sync→open+parse processing.

Whether the fix is more accurate shared state bookkeeping wrt. current mailbox or just an extra kludge flag that prevents dangerous things during mailbox switching is the question I guess. Former is nicer, but latter is less work and less risk.

vuori avatar Mar 13 '23 11:03 vuori