alot icon indicating copy to clipboard operation
alot copied to clipboard

Startup very slow on master (due to "lazy thread lookup" fix)

Open lfos opened this issue 1 year ago • 8 comments

Describe the bug Running alot from the master branch, startup is extremely slow (almost one minute, compared to <1 second with 0.10). The culprit seems to be commit caf87d96, based on a git-bisect(1) run. Reverting that commit on master brings the startup time back to 0.10 behavior. Note that my inbox has around 500k messages.

Software Versions

  • Python version: 3.11.8
  • Alot version: c1137ea9 (bump gpg dependency, 2024-02-12)

lfos avatar Feb 17 '24 03:02 lfos

Since 0.11 has recently been released, I went back to mainline and noticed this is still a major issue. Without locally reverting caf87d9, alot is too slow for everyday use for me, despite using a faily decent system with SSD.

@pazz Given the significant performance impact, should we revert the commit? Any alternatives we can explore?

lfos avatar Sep 01 '24 17:09 lfos

@lucc do you have an opinion? I don't mind reverting but cannot recall the history or having done any deep analysis on this. that hotfix I merged does look quite intrusive

pazz avatar Sep 03 '24 13:09 pazz

I do not remember much from when the original fix was merged. Looking at the code it seems that the generator that produces the thread ids might produce an invalid thread id if the underlying message gets deleted (from the buffer or the db by the way?).

Right about now I would be happy to have a test case for the original issue so that we could test alternative solutions. Could we somehow validate the thread/id before yielding it from the generator? That way we could benefit from the lazyness but not be bitten by invalid pointer.

lucc avatar Sep 03 '24 15:09 lucc

Some background and explanations on how to reproduce can be found in issue #1460. Based on others' comments there, commit caf87d9 seems to have been introduced as a workaround for a notmuch bug, and it is unclear whether it is still needed. FWIW, in order to be able to use alot, I have reverted caf87d9 locally, and I can't reproduce the crash when following the steps outlined in the original bug report #1460.

lfos avatar Sep 06 '24 16:09 lfos

FWIW, I've been using this patch for a while -- essentially a mix between old and new behavior:

diff --git a/alot/db/manager.py b/alot/db/manager.py
index 39c34e2e..09c537ab 100644
--- a/alot/db/manager.py
+++ b/alot/db/manager.py
@@ -4,6 +4,7 @@
 # For further details see the COPYING file
 from collections import deque
 import contextlib
+import itertools
 import logging
 
 from notmuch2 import Database, NotmuchError, XapianError
@@ -344,11 +345,18 @@ class DBManager:
         assert sort in self._sort_orders
         db = Database(path=self.path, mode=Database.MODE.READ_ONLY,
                       config=self.config)
-        thread_ids = [t.threadid for t in db.threads(querystring,
-                            sort=self._sort_orders[sort],
-                            exclude_tags=self.exclude_tags)]
-        for t in thread_ids:
-            yield t
+
+        # retrieve and store the first 1000 thread IDs
+        thread_iterator = db.threads(querystring,
+                                     sort=self._sort_orders[sort],
+                                     exclude_tags=self.exclude_tags)
+        thread_ids = [t.threadid for t in
+                      itertools.islice(thread_iterator, 1000)]
+
+        # yield the stored thread IDs first, then proceed with iterator
+        yield from thread_ids
+        for t in thread_iterator:
+            yield t.threadid
 
     def add_message(self, path, tags=None, afterwards=None):
         """

While definitely still a workaround, it at least makes alot usable again for me (load times are acceptable again and no crashes so far).

lfos avatar Jan 31 '25 21:01 lfos

Today I started encountering complete freezes (for at least minutes) when opening any thread with more than one message. I found this thread, applied the path of @lfos and that seems to work. What would be the proper fix for this issue? I cannot promise anything, but I will see if I can invest some time to make a PR. Also, this started happening out-of-the-blue for me, I updated my Debian earlier today, and the only related update would have been mailcap related.

saemideluxe avatar Mar 12 '25 09:03 saemideluxe

Oh, never mind the "freeze" is still happening, for threads with a .csv attachment. But in any case, the suggested patch definitely improves startup performance. (Edit: Actual problem was me using 'needsterminal' in .mailcap.)

saemideluxe avatar Mar 12 '25 09:03 saemideluxe

I updated alot to 0.11 and I can confirm that it's very slow.

Perhaps alot could make lazy search configurable (non-lazy by default), that is, provide a way to disable https://github.com/pazz/alot/commit/caf87d960d8842bdc66e5aae8e36b598285de48c. It's great that alot help users to workaround problems related projects by prividing safe defaults, but some users were less affected by https://github.com/pazz/alot/issues/1460 depending on notmuch vs notmuch2 bindings, database settings, etc. When going fully lazy, alot could try to catch notmuch exceptions and replace the broken connection with a new one.

I'm clueless about non-sql databases, but to me notmuch bindings seem quite limited. If notmuch2 implemented limit configurations one could:

  • Limit notmuch searches in alot (e.g. search --limit=100 tag:inbox AND NOT tag:killed). My daily routine to check the inbox doesn't need to load thousands of messages. Loading so many messages is rarely necessary unless I can vaguely recall any search terms or date ranges. For the specific case of checking the inbox one can currently use a date range, e.g. search tag:inbox AND NOT tag:killed AND (date:this_month OR date:last_month).
  • Explore the use of cursor-based pagination to load search results in batches.
page_limit = 100
page_threads = list(db.threads(querystring,
                             # Concept idea, not supported by notmuch2 bindings
                             limit=page_limit,
                             sort=(
                                 notmuch2.Database.SORT.NEWEST_FIRST,
                                 notmuch2.Database.SORT.THREAD_ID
                             ),
                             exclude_tags=self.exclude_tags))
yield from map(lambda t: t.threadid, page_threads)

while len(page_threads) = page_limit:
    last_thread = page_threads[-1]
    # Concept idea, querystring not supported by notmuch
    page_querystring = querystring + f" AND ((date:{last_thread.last}) AND thread:>{last_thread.threadid}) OR (date:..{last_thread.last} AND (NOT date:{last_thread.last})))"
    page_threads = list(db.threads(page_querystring,
                                 # Concept idea, not supported by notmuch2 bindings
                                 limit=page_limit,
                                 sort=(
                                     notmuch2.Database.SORT.NEWEST_FIRST,
                                     notmuch2.Database.SORT.THREAD_ID
                                 ),
                                 exclude_tags=self.exclude_tags))

    yield from map(lambda t: t.threadid, page_threads)

javiertury avatar Apr 26 '25 09:04 javiertury

@javiertury - Great idea. I just sent a PR to implement search limits.

lfos avatar Jul 30 '25 15:07 lfos

Fixed via https://github.com/pazz/alot/pull/1696.

lfos avatar Sep 02 '25 13:09 lfos