[Feature request] Smarter, simpler search

Open Daeron08 opened this issue 5 years ago • 14 comments

As far as I know, the default search is unable to handle queries consisting of multiple words if:

  • the words are present, but are out of order
  • there are additional characters between them

For example, let's say our note consists of these lines:

View network interfaces ip a

  • If you search for view network, the line will be found (exact match).
  • If you search for view interfaces, the line will NOT be found (additional characters in between the queried words).
  • If you search for network view, the line will NOT be found (words are out of order).

Of course, one can try using regex:

  • view.*interfaces --> solves the issue of characters between the two words
  • network view|view network --> solves the issue of words out of order

However, this is a really cumbersome solution that adds a lot of overhead to quickly jumping between relevant notes. 95% of the time I just want a quick query and results returned as long as the words are both present in any form. I believe this is how OneNote works as well. Is this currently not possible, or am I missing something?

Is there any chance of implementing this kind of search either as the default or a separate option? This is the main thing holding me back from adopting an otherwise very promising software.
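For illustration, the requested behaviour can be sketched in a few lines of Python. This is only a minimal sketch of the idea, not CherryTree's implementation; the function name `matches_all_words` is hypothetical, and matching is assumed to be case-insensitive substring matching.

```python
def matches_all_words(query: str, text: str) -> bool:
    """Return True if every whitespace-separated word of `query` occurs in `text`.

    Matching is case-insensitive and ignores both word order and whatever
    characters sit between the words.
    """
    haystack = text.casefold()
    return all(word in haystack for word in query.casefold().split())


line = "View network interfaces ip a"
for query in ("view network", "view interfaces", "network view", "view routes"):
    print(f"{query!r}: {matches_all_words(query, line)}")
```

With this approach, view interfaces and network view both match the example line without any regex, while a query containing a word that is absent from the line does not.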

Daeron08 avatar Oct 17 '20 00:10 Daeron08

I'm not personally convinced, unless I'm missing something. Which other application do you have in mind for this proposal? I am used to the Kate editor, and I don't feel lost with CT, which includes regex search.

If you really want to strictly find view network (and not network view), what would be your proposal?

ghost avatar Oct 18 '20 21:10 ghost

I mentioned Microsoft OneNote, which works the way I described (referring mainly to the global search, CTRL+E). It'll match as many relevant results as possible, regardless of word order or additional characters in between search terms.

Example page content:

View network interfaces

OneNote search results:

  • view interfaces -> page found
  • network view -> page found

Cherrytree search results:

  • view interfaces -> page not found
  • network view -> page not found

Cherrytree with regex search results:

  • view.*interfaces -> page found
  • network view|view network -> page found

Again, I want to emphasise that while regex solves the issue, it slows down the workflow considerably because you have to type everything twice, and it only gets worse as more words are used.

Switching between strict search and "smart" search could be handled via a checkbox (same as "word search" etc.) or having it assigned to a different shortcut.

Daeron08 avatar Oct 20 '20 12:10 Daeron08

What you are looking for is a specific word search option (not so common AFAIK). The space character ' ' in a text string does not mean 'or'; a space is a character like any other.

You forgot to answer this one: If you really want to strictly find view network (and not network view), what would be your proposal?

ghost avatar Oct 20 '20 13:10 ghost

The most straightforward choice would be to enclose the term between quotation marks (works the same in OneNote and most internet search engines): "view network"

Alternatively, with a checkbox or radio button one could switch between the current implementation of search (which is strict in nature) and the new version, which is not.

I'd argue the "word" search I described is actually fairly common: search engines.
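The quotation-mark convention proposed above could be sketched as follows. This is an illustrative assumption about how such a query might be parsed, not a description of OneNote or CherryTree; the names `parse_terms` and `matches` are hypothetical.

```python
import re

# A term is either a double-quoted phrase or a run of non-space characters.
_TERM = re.compile(r'"([^"]*)"|(\S+)')


def parse_terms(query: str) -> list:
    """Split a query into lowercase terms, keeping quoted phrases intact."""
    terms = []
    for phrase, word in _TERM.findall(query):
        term = phrase or word
        if term:  # skip empty quotes such as ""
            terms.append(term.casefold())
    return terms


def matches(query: str, text: str) -> bool:
    """True if every term (word or quoted phrase) occurs somewhere in `text`."""
    haystack = text.casefold()
    return all(term in haystack for term in parse_terms(query))


page = "View network interfaces"
print(matches("network view", page))      # loose: word order is ignored
print(matches('"view network"', page))    # strict: phrase is present as typed
print(matches('"network view"', page))    # strict: phrase is not present
```

Quoted phrases stay strict while unquoted words are matched in any order, so both search styles could share one input box.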

Daeron08 avatar Oct 23 '20 07:10 Daeron08

Then use the regex \"view network\" to find the string "view network"?

actually fairly common: search engines

I think you mean common for web browser search engines. I have never had this feature in any application I've used (in a Linux environment). I am not familiar with the Word editor, but according to rich-text-editors (2008) it seems that macros are necessary to find multiple words:

  1 - you define the strings: WordCollection(0) = "very" ; WordCollection(1) = "just" ; WordCollection(2) = "of course"
  2 - you run a macro

Now that your request is clarified, let's see other comments.

ghost avatar Oct 23 '20 07:10 ghost

I would love this feature too, but probably even smarter than described. I've got over 2,000 nodes in my CT document, and I often struggle to remember exact phrases I used when writing about something that I later want to find. When writing, I try to use tags and phrases that I'll think to search for, but I don't always think to do that and can't always predict what I'll want to find. So a flexible search would be massively beneficial to me.

I would like an advanced search engine that looks for nodes containing as many of the words as possible, scoring a node higher when the words are close together, in the same order, or occur more frequently; perhaps falling back to word stemming to match variants of the words; and even (optionally) searching for synonyms would be nice, but maybe that's a separate feature.
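As a rough illustration of the ranking idea described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function names, the tokenisation, and especially the weights, which are arbitrary. Word stemming and synonyms are not covered.

```python
import re


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.casefold())


def score_node(query: str, node_text: str) -> float:
    """Score a node: more query words, closer together, in order, more often."""
    query_words = list(dict.fromkeys(tokenize(query)))  # unique, order kept
    first_seen = {}  # query word -> index of its first occurrence
    counts = {}      # query word -> number of occurrences
    for index, token in enumerate(tokenize(node_text)):
        if token in query_words:
            first_seen.setdefault(token, index)
            counts[token] = counts.get(token, 0) + 1
    if not first_seen:
        return 0.0

    found = [word for word in query_words if word in first_seen]
    coverage = len(found) / len(query_words)
    span = max(first_seen.values()) - min(first_seen.values()) + 1
    proximity = len(found) / span
    positions = [first_seen[word] for word in found]
    in_order = positions == sorted(positions)
    frequency = sum(counts.values())
    # Arbitrary illustrative weights; a real implementation would tune these.
    return 10 * coverage + 2 * proximity + (1 if in_order else 0) + 0.1 * frequency


nodes = {  # hypothetical node contents
    "close": "View network interfaces",
    "far": "Interfaces are listed below. Later we view them.",
    "none": "Unrelated note",
}
for name in sorted(nodes, key=lambda n: score_node("view interfaces", nodes[n]), reverse=True):
    print(name, round(score_node("view interfaces", nodes[name]), 2))
```

In this sketch a node where the words are adjacent and in query order ranks above one where they are far apart or reversed, and a node containing none of the words scores zero.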

rmwiseman avatar Oct 05 '21 07:10 rmwiseman

In support of rmwiseman's comment: cherrytree is my main memory aid for many things, but I sometimes struggle to retrieve notes, e.g. because CT only searches forward or backward.

karel1 avatar Dec 31 '21 12:12 karel1

The correct regex syntax to find multiple words (first request / 17 Oct 2020) is to separate the words with a vertical bar (| = or).

word1|word2|word3

I would like an advanced search engine that looks for nodes containing as many of the words as possible, scoring a node higher when the words are close together, in the same order, or occur more frequently; perhaps falling back to word stemming to match variants of the words; and even (optionally) searching for synonyms would be nice, but maybe that's a separate feature.

Maybe a new thread would be necessary to precisely propose and specify a new feature after clarification, with examples to illustrate, step by step.

Just adding a 'word count feature' to easily identify the most relevant nodes, before getting the details, should not be very difficult, and could be a first step before a much more complex request, which would contradict the title of this "Smarter, simpler search" thread.
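A 'word count feature' of this kind might look roughly like the following Python sketch. The node names and contents are made up for the example, and `count_query_words` is a hypothetical helper, not an existing CherryTree function.

```python
import re
from collections import Counter


def count_query_words(query: str, node_text: str) -> dict:
    """Count how often each query word occurs in a node (case-insensitive)."""
    token_counts = Counter(re.findall(r"\w+", node_text.casefold()))
    return {word: token_counts[word] for word in query.casefold().split()}


# Hypothetical node contents, keyed by node name.
nodes = {
    "Editors": "Kate supports regex search.",
    "Networking": "View network interfaces. Restart the network service.",
}

query = "view network"
ranked = sorted(
    nodes,
    key=lambda name: sum(count_query_words(query, nodes[name]).values()),
    reverse=True,
)
for name in ranked:
    print(name, count_query_words(query, nodes[name]))
```

Listing the nodes by total count gives a quick overview of where the words occur most before any individual match is opened.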

ghost avatar Dec 31 '21 19:12 ghost

Please try the test version v1.5.0+47 from https://www.giuspen.net/cherrytree/#downl and look for the new options in the search dialog.

Image

giuspen avatar Aug 23 '25 11:08 giuspen

Many thanks @giuspen. I tried out the new version, and although it runs OK and I can edit notes, I tried a search using the new option and it crashed…

Gtk-Message: 14:23:09.844: Failed to load module "xapp-gtk3-module"
/usr/lib/x86_64-linux-gnu/gvfs/libgvfscommon.so: undefined symbol: g_task_set_static_name
Failed to load module: /usr/lib/x86_64-linux-gnu/gio/modules/libgvfsdbus.so
[2025-08-29 14:23:09.919] [che] [debug] /home/richard/.config/cherrytree/config.cfg parsed
[2025-08-29 14:23:10.066] [gtk] [critical] gtkmm: object `RecentDocs' not found in GtkBuilder file.
[2025-08-29 14:23:10.066] [gtk] [critical] gtkmm: Gtk::Builder: widget `RecentDocs' was not found in the GtkBuilder file, or the specified part of it.
[2025-08-29 14:23:10.066] [gtk] [critical] Gtk::Builder::get_widget(): dynamic_cast<> failed.
[2025-08-29 14:23:10.083] [che] [debug] autosave on 1 min
[2025-08-29 14:23:10.922] [che] [debug] Node 3105 > 29 Fri
[2025-08-29 14:23:11.197] [che] [debug] fs::download_file: start downloading https://raw.githubusercontent.com/giuspen/cherrytree/master/debian/changelog
[2025-08-29 14:23:35.937] [gtk] [warning] atk-bridge: get_device_events_reply: unknown signature
[2025-08-29 14:24:10.221] [che] [debug] autoSaveCounter->0
[2025-08-29 14:24:10.221] [che] [debug] autosave needed
[2025-08-29 14:25:10.222] [che] [debug] autoSaveCounter->0
[2025-08-29 14:25:10.222] [che] [debug] autosave no need
[2025-08-29 14:26:10.216] [che] [debug] autoSaveCounter->0
[2025-08-29 14:26:10.216] [che] [debug] autosave no need
[2025-08-29 14:27:10.215] [che] [debug] autoSaveCounter->0
[2025-08-29 14:27:10.215] [che] [debug] autosave no need
[2025-08-29 14:28:10.271] [che] [debug] autoSaveCounter->0
[2025-08-29 14:28:10.272] [che] [debug] autosave no need
[2025-08-29 14:29:10.220] [che] [debug] autoSaveCounter->0
[2025-08-29 14:29:10.220] [che] [debug] autosave no need
[2025-08-29 14:30:10.271] [che] [debug] autoSaveCounter->0
[2025-08-29 14:30:10.271] [che] [debug] autosave no need
[2025-08-29 14:31:10.271] [che] [debug] autoSaveCounter->0
[2025-08-29 14:31:10.271] [che] [debug] autosave no need
[2025-08-29 14:32:10.269] [che] [debug] autoSaveCounter->0
[2025-08-29 14:32:10.269] [che] [debug] autosave no need
[2025-08-29 14:33:10.271] [che] [debug] autoSaveCounter->0
[2025-08-29 14:33:10.272] [che] [debug] autosave no need
[2025-08-29 14:34:10.214] [che] [debug] autoSaveCounter->0
[2025-08-29 14:34:10.214] [che] [debug] autosave no need
[2025-08-29 14:35:10.219] [che] [debug] autoSaveCounter->0
[2025-08-29 14:35:10.220] [che] [debug] autosave no need
[2025-08-29 14:36:10.272] [che] [debug] autoSaveCounter->0
[2025-08-29 14:36:10.273] [che] [debug] autosave no need
[2025-08-29 14:37:10.272] [che] [debug] autoSaveCounter->0
[2025-08-29 14:37:10.272] [che] [debug] autosave no need
[2025-08-29 14:38:10.272] [che] [debug] autoSaveCounter->0
[2025-08-29 14:38:10.273] [che] [debug] autosave no need
[2025-08-29 14:43:25.882] [che] [debug] autoSaveCounter->0
[2025-08-29 14:43:25.882] [che] [debug] autosave no need
[2025-08-29 14:43:45.602] [che] [debug] auto reg_exp (?=.*mosquitto)(?=.*configuration).*
Bus error (core dumped)

Something I did wrong?

rmwiseman avatar Sep 01 '25 07:09 rmwiseman

Hi @rmwiseman, thanks for testing! Is this crash systematic? Can you reproduce it with multiple different searches? I would like either to reproduce it myself (ideally with test data from you and details of your Linux distribution/version), or for you to help me further by building from the source code yourself as described at https://github.com/giuspen/cherrytree/blob/master/BUILDING.md and then following https://github.com/giuspen/cherrytree/blob/master/BUILDING.md#to-generate-a-backtrace-for-a-crash-bug-report, since with the crash backtrace I can better understand what is going on.

giuspen avatar Sep 01 '25 20:09 giuspen

Thanks for getting back to me @giuspen. I tried again with the app image and this time it didn't crash — I should have persisted yesterday!

Oddly, I'm not sure it is working properly, however.

If I search for mosquitto config with the new option selected, I get three results, all of which are the text "Mosquitto config" (exactly that text, minus the quotes), two in a note's body, one in a note's title.

If I search in exactly the same way but for config mosquitto, I get no matches. Similarly, if I search for aaa ccc bbb where aaa bbb ccc is in the document, it doesn't find anything.

I also noticed that if I search for two words that aren't on the same line, it doesn't find them. When I'm searching for a particular note, I often need to search across multiple lines.

The regexp solution is clever, but I think we still need a more flexible search that doesn't require terms to be on the same line.

IMPORTANT UPDATE!

Some of the comments above are symptoms of another issue, I think.

I've noticed that the log shows e.g. [2025-09-02 10:56:25.763] [che] [debug] auto reg_exp (?=.*aaa)(?=.*ccc)(?=.*bbb).* when performing a search with the new option, but only the first search! Subsequent searches don't log the regexp and only find a match if the words are exactly as searched for. So I assume the regexp isn't being generated for subsequent searches.

The comment about searching across multiple lines still stands, but if I want to search using the new option, I have to restart CherryTree first.
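To illustrate the two observations above, here is a small Python sketch that builds a pattern of the same shape as the one in the log and shows why it is limited to a single line. It uses Python's `re` module purely for demonstration; CherryTree's own regex engine and code may behave differently, and `build_all_words_pattern` is a hypothetical name.

```python
import re


def build_all_words_pattern(words) -> str:
    """Build one lookahead per word, so all words must occur in any order."""
    return "".join(f"(?=.*{re.escape(word)})" for word in words) + ".*"


pattern = build_all_words_pattern(["aaa", "ccc", "bbb"])
print(pattern)  # same shape as the logged auto reg_exp

same_line = "aaa bbb ccc"
two_lines = "aaa\nbbb ccc"

# By default '.' does not match a newline, so every lookahead is confined
# to the line where the match attempt starts.
print(bool(re.search(pattern, same_line)))
print(bool(re.search(pattern, two_lines)))

# With DOTALL, '.' also matches newlines, so the words may sit on
# different lines of the same text.
print(bool(re.search(pattern, two_lines, flags=re.DOTALL)))
```

In Python, the single-line limitation comes from the default meaning of `.`; a flag that lets `.` match newlines is one way to lift it, at the cost of scanning more text per lookahead.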

rmwiseman avatar Sep 02 '25 10:09 rmwiseman

Hi @rmwiseman thanks for your tests,

I can confirm that it does not currently search across multiple lines; I will see if I can improve that. As for the issue that it works only on the first search, I thought I had fixed that already. Have you been testing v1.5.0+47 or v1.5.0+60 from https://www.giuspen.net/cherrytree/#downl ? In the latter, the search dialog will look like the screenshot below.

Image

If subsequent searches still fail for you with v1.5.0+60, please describe in more detail the search options you set in your first and subsequent searches.

giuspen avatar Sep 02 '25 19:09 giuspen

Ah, I've been using CherryTree-1.5.0+47-x86_64.AppImage — apologies, I didn't realise I was behind the curve! I'll have a try with the latest one.


It looks good now. The searches I've tried all work. I'll keep using that version throughout today and update if there are any issues.

rmwiseman avatar Sep 03 '25 06:09 rmwiseman