quickmove-extension icon indicating copy to clipboard operation
quickmove-extension copied to clipboard

[feature request] *partial match* on *full path*

Open liar666 opened this issue 1 year ago • 1 comments

Hi,

Thanks for the very nice tool. I used Nostalgy++ for a while, but I prefer your tool for the beautiful GUI and its ability to handle correctly custom key bindings.

However, there a thing that I miss a lot from Notalgy++, is the ability to match full path of directories and even with partial match.

e.g. I'm a teacher. There are several courses I teach every year. I teach courses with similar titles to various audiences. So I have a directory structure similar to: <year_number>/<course_audience>/ ; example : 2024-2025/Master/job.

When I select a target directory for batch moving emails, I would like to be able to type "25 ma jo" and get this examples directory selected over all the other ones.

liar666 avatar Aug 30 '24 10:08 liar666

This seems reasonable. The search code is here, and would need to be modified to match on parts. Note this gets called a lot, so performance optimization is key. I'm marking this as a good first bug in case anyone wants to pick it up.

https://github.com/kewisch/quickmove-extension/blob/e5490f3ae4bb806c1cd016335b1e35743c2a97b6/src/popup/baseItemList.js#L483-L495

kewisch avatar Sep 11 '24 05:09 kewisch

Could it be something like this?

https://github.com/micz/quickmove-extension/commit/d7fe72e1b9c3f90b22d8abacb55d4991bee80b02

It checks if any word in the search strings matches any part of the folder path. If this approach works for you, I can create a PR.

micz avatar Jan 06 '25 22:01 micz

Maybe this

if (!hasAccent) {
    itemText = itemText.normalize("NFD").replace(DIACRITICS, "");
}

has to be applied also to the path before spitting in segments?

micz avatar Jan 06 '25 22:01 micz

Perhaps it would be better to enable this new behavior as an option?

micz avatar Jan 07 '25 06:01 micz

Something like that might work. You might need to instead traverse the folder nodes to root instead of splitting by /, because it is unfortunately possible that folder names have a / as well (e.g. Owl).

What I am worried about is performance though. This makes the lookup O(n^2), and it is done for all folders. Ideally we'd build some sort of search tree or full text indexing, though I know that is a bit more work.

Do you see a way to optimize performance here?

kewisch avatar Jan 07 '25 10:01 kewisch

Something like that might work. You might need to instead traverse the folder nodes to root instead of splitting by /, because it is unfortunately possible that folder names have a / as well (e.g. Owl).

Do you think that could be a problem? The folder matching should still occur in this case. Also, what does the '/' in the folder name represent? How is the folder path displayed in the add-on in this case? Are you representing it differently?

What I am worried about is performance though. This makes the lookup O(n^2), and it is done for all folders. Ideally we'd build some sort of search tree or full text indexing, though I know that is a bit more work.

Given that we are handling folders and not messages, do you think the average user will notice any impact? In my test on a very old machine (from 2011), I observed no difference between the original version and this one. Perhaps placing this new algorithm behind a preference setting could be beneficial, not only due to the different behavior (this version displays folders when the string appears anywhere in the path, not just in the final folder name), but also to address potential performance concerns.

Do you see a way to optimize performance here?

Maybe in this line https://github.com/micz/quickmove-extension/blob/d7fe72e1b9c3f90b22d8abacb55d4991bee80b02/src/popup/baseItemList.js#L528 the use of some could be replaced with a loop that breaks at the first occurrence?

micz avatar Jan 07 '25 17:01 micz

Something like that might work. You might need to instead traverse the folder nodes to root instead of splitting by /, because it is unfortunately possible that folder names have a / as well (e.g. Owl).

Do you think that could be a problem? The folder matching should still occur in this case. Also, what does the '/' in the folder name represent? How is the folder path displayed in the add-on in this case? Are you representing it differently?

Yes, this has been an issue, there was an Owl issue here recently. Replacement is simple though see common/foldernode.js:

item.fullNameParts.some(segment => segment.includes(word));

It might also make sense to have an Iterator version of fullNameParts so that walking stops when a segment is found.

What I am worried about is performance though. This makes the lookup O(n^2), and it is done for all folders. Ideally we'd build some sort of search tree or full text indexing, though I know that is a bit more work.

Given that we are handling folders and not messages, do you think the average user will notice any impact? In my test on a very old machine (from 2011), I observed no difference between the original version and this one. Perhaps placing this new algorithm behind a preference setting could be beneficial, not only due to the different behavior (this version displays folders when the string appears anywhere in the path, not just in the final folder name), but also to address potential performance concerns.

I've seen a few reports saying that generating the folder list is very slow, so I do believe this is the case. Maybe when there are many many many folders on a slow machine. If we make it a default-off pref then I think we could risk it and wait until someone complains that it is slow when they turn it on.

I don't know off hand if it would be faster, but we could also maybe cache a string in the folder nodes lazily. Each node has a full text string that includes its own name and the name of the parents, so when we want to check if a folder should be in the list we just need to run one includes call on the cached string. We calculate this for each node the first time it occurs. This way we don't have an initial load penalty to cache the items. The first search would be just as slow, but as folks continue typing the subsequent searches would be faster. Would that make sense?

kewisch avatar Jan 07 '25 20:01 kewisch

If we make it a default-off pref then I think we could risk it and wait until someone complains that it is slow when they turn it on.

I was thinking exactly about that.

I don't know off hand if it would be faster, but we could also maybe cache a string in the folder nodes lazily.

That's a great idea. Maybe there is no need to split the full path string either. I'll do some tests.

micz avatar Jan 07 '25 22:01 micz

I modified the code: https://github.com/micz/quickmove-extension/commit/80d41af6c656ad02c6db4a054a2977a79219f454

There is no need to split the path. I tested it with 5000 folder and there is no delay. I think it should be used a default-off pref because the behaviour changes as described here: https://github.com/kewisch/quickmove-extension/issues/169#issuecomment-2575909132

Let me know if it's ok for you and how you want to procede.

micz avatar Jan 08 '25 20:01 micz

Sweet, thanks for testing, this is good news. I'm a bit reluctant to use the path though for matching. I believe there are some edge cases where the folder path does not match the display name. I was certain I had an example from a past issue but couldn't find it. In theory though, the display name can differ from the folder path. One potential case is when the folder name is localized, e.g. the path is INBOX/Drafts but in Thunderbird it is displayed in German as Entwürfe. The API docs also say "Although paths look predictable, never guess a folder’s path, as there are a number of reasons why it may not be what you think it is" which could be related.

I'm not 100% certain this is true, it might be worth asking in the add-on developer's channel if it is safe to use the path to match folder display names or if my suspicion is true. The safe path is to actually use the name and iterate upwards.

kewisch avatar Jan 09 '25 19:01 kewisch

I understand your point, you're right. Also, this is your add-on, so you're the boss. :thumbsup: :smile:

I'll try to use the lazy loading you suggested to create a string iterating upwards using the display names. This should be fastest than creating an array to traverse with every word. Then I'll test it with the 5000 folders test environment. Is it ok for you this approach?

micz avatar Jan 09 '25 20:01 micz

Yep that works for me, thank you so much for persisting on this!

kewisch avatar Jan 09 '25 21:01 kewisch

I added a new property to FolderNode, #_fullSearchString, and then implemented a getter, with lazy laoding of the property. Finally, I use it in the repopulate method.

In the test with 5000 folders, it is as fast as before.

I hope I got it right this time!

micz avatar Jan 10 '25 21:01 micz

If that https://github.com/kewisch/quickmove-extension/issues/169#issuecomment-2584330744 is right, the next steps could be:

  1. Sync my fork with you last commits
  2. Add the corresponding option
  3. Modify the repopulate method to use the option
  4. Submit a PR

In step 3, do you prefer a single loop with an if statement inside it, or two separate loops with the if condition outside? The former avoids code duplication, but the latter might be faster.

micz avatar Jan 11 '25 15:01 micz

I created the pull request #200. Let me know! 👍

micz avatar Jan 16 '25 21:01 micz

Fixed by #200

kewisch avatar Jan 18 '25 18:01 kewisch