tomboy-ng icon indicating copy to clipboard operation
tomboy-ng copied to clipboard

Searching while typing

Open effelsberg opened this issue 3 years ago • 4 comments

The special problem of searching while typing is to keep the GUI repsonsive. There is already a threaded search implemented. However, the end of it is actively waited for.

My simple proposed solution splits the searching algorithm to that it can run freely. The GUI uses old-school polling to observe the searcher.

To disambiguate my new polled threaded search, I gave it the name "Thearch".

What to do in the GUI:

Add a timer with 10 ms interval and set the OnTimer event. It was automatically named Timer1Timer for me. I found that constant 10 ms poll does not load the CPU observably, so there are no means to kill it if there is nothing to do.

Set an OnChange event for Edit1. If the contents change it simply sets a variable to notifiy this event.

// New variables. Place and initialize them properly.
ThearchRequested : boolean;
Thearch1 : qword;

procedure TSearchForm.Edit1Change(Sender: TObject);
begin
  ThearchRequested := True;
end;

procedure TSearchForm.Timer1Timer(Sender: TObject);
var
  Thearch4 : qword;
begin
  if ThearchRequested then begin
      if NoteLister.StartThearchNotes(Edit1.Text) > 0 then begin
          ThearchRequested := False;
          Thearch1 := GetTickCount64;
          StatusBar1.SimpleText := 'Thearch!';
      end;
  end;
  NoteLister.RunThearchNotes();
  if NoteLister.ThearchNotesJustFinished then begin
      Thearch4 := GetTickCount64;
      ButtonClearFilters.Enabled := True;
      if Edit1.Text = '' then NoteLister.LoadListView(ListViewNotes, False)
      else NoteLister.LoadListView(ListViewNotes, True);
      NoteLister.LoadListNotebooks(ListBoxNotebooks.Items, True);
      StatusBar1.SimpleText := 'Thearch=' + inttostr(Thearch4 - Thearch1) + 'ms and we found ' + inttostr(ListViewNotes.Items.Count) + ' notes';
  end;
end;

Now that we have a polling GUI, the thearch must be implemented. It is basically the existing search split up in starting and running parts.

// New variables. Place, initialize and destroy them properly.
ThearchState : integer;  // 0 idle  1 running  2 just finished
// These are local in the non-polled search. However, they need to exist throughout the search.
ThearchTermList, ThearchFileList : TStringList;

// Like SearchNotes, but:
//   - TermList and FileList are no longer local
//   - Does not wait for the threads to finish.
// Returns the number of started threads.
function TNoteLister.StartThearchNotes(const Term: ANSIstring): longint;
var
    ThreadIndex : integer = 0;
    SearchThread : TSearchThread;
begin
    result := 0;
    if ThearchState = 1 then begin
        // Still searching. Do nothing (or restart the search if we know how ...)
    end else begin
        ThearchTermList.Free;
        ThearchTermList := TStringList.Create;
        ThearchFileList.Free;
        ThearchFileList := Nil;
        // Is try ... still needed here?
        try
            BuildSearchList(ThearchTermList, Term);
            if DebugMode then debugln('Empty Search Lists created');
            SearchNoteList.Free;
            SearchNoteList := TNoteList.Create;
            FinishedThreads := 0;
            ThreadLock := -1;
            ThearchFileList := FindAllFiles(WorkingDir, '*.note', false); // list contains full file names !
            while ThreadIndex < ThreadCount do begin
                SearchThread := TSearchThread.Create(True);         // Threads clean themselves up.
                SearchThread.ThreadBlockSize := ThearchFileList.Count div ThreadCount;
                SearchThread.Term_List := ThearchTermList;
                SearchThread.File_List := ThearchFileList;
                SearchThread.ResultsList := SearchNoteList;
                SearchThread.TIndex := ThreadIndex;
                SearchThread.CaseSensitive := Sett.SearchCaseSensitive;
                SearchThread.start();
                inc(ThreadIndex);
            end;
            ThearchState := 1;
            result := ThreadCount;
        finally
        end;
    end;
end;

// Needs to be polled and implements the finishing part.
procedure TNoteLister.RunThearchNotes();
begin
    if ThearchState = 1 then begin
        if FinishedThreads >= ThreadCount then begin
            SearchNoteList.Sort(@LastChangeSorter);
            ThearchState := 2;
        end;
    end;
end;

// Reports a finished search.
function TNoteLister.ThearchNotesJustFinished(): boolean;
begin
    result := False;
    if ThearchState = 2 then begin
        result := True;
        ThearchState := 0;
    end;
end;

How does it feel? Well, there seems to be a difference between my Linux (Ubuntu 18.04) and Windows 10 systems. The search feels great on both systems with my regular set of 500+ notes, so I created an artificial set of notes by splitting a Bible text into packs of 10 verses each and got 3000+ notes. On my Linux system, everything is still smooth, on Windows I sometimes experience a lag when filling the list in the GUI. The lag is not reproducible, it could be less than 100 ms or nearly a full second under the same searching conditions. There is of course a culprit in the fact that the search starts with the first typed letter, e.g. if I search for Noah, the letter "N" returns all notes, "No" still 1000+ notes.

It would be possible to accomodate slow GUI list updates in a similar polling fashion. I could try to go down this rabbit hole. I really like the auto-updating search, it would be great if it becomes a standard (maybe user settable) feature.

effelsberg avatar Oct 08 '20 22:10 effelsberg

Nice Work ! I have not read over it closely but looks like you have gotten right into it. One initial question, why a timer driven polling when you could trigger off a new search each time the user makes a change to edit1 ?

We would need to disable the timer if user moves focus away from edit1, most users leave -ng running in the background and a timer running would steal cpu from foreground apps and even prevent it being swapped out on system with memory issues.

I have long wanted to implement progressive searching like this but was worried about the impact on users with lots of notes. I have several I can name with 20k or more (!!?). So interesting that you find it ok still with 3K. Are you using a spinning hard disk or a SS one ?

I am personally cramped for time right now, a couple of weeks will be better again. But very interested!

(As an aside, draft pull request rules, focus on one issue per pull request, minimize impact on unconnected code, you get listed as an author !)

Davo

davidbannon avatar Oct 09 '20 02:10 davidbannon

Let me explain a bit ...

The Windows and Linux systems I use are both equipped with Core i5, 8GB, SSD as the main disk and HDD for bulk data. The notes are usually stored on the SSD but I can try the HDD instead for extra performance penalty. I also think that G-DATA on the Windows system is a performance penalty by itself.

I'm just starting to get proficient with Object Pascal and Lazarus, so I looked for a simple solution that I found to be tried-and-true. Under Tcl/Tk I often use schemes like after idle to schedule a computation in the main thread after the GUI has done its work. Using the Timer as proposed does the same thing, it starts a computation in the main GUI thread so I don't have to care about synchronization. This leaves the finishing of the search threads as the only synchronization point and that was easy, I also don't fear state machines. Waiting for the threads to finish is likewise the easy way, because the search result can be sorted before being displayed.

I'm not too proficient with the threads at the moment. I tried to terminate running threads and start them with new data but the programme crashed after the third letter I typed. I think I need some time to learn this kind of stuff.

It would also be possible to fill the GUI list of search results progessively, but that has to be done with care and in a smart way. For one, the list inserter must insert the entry at the proper position according to the sorting rules. The list could also be filled from the threads themselves using Synchronize. I don't know what the time penalty of changing to the main thread is. I think I'll try and see what happens.

My artificial set of notes is quite uniform as each note contains 10 Bible verses. Also, the Bible is not enough to fill 20k notes with individual data of varying lengths and I don't want to use lorem ipsum as a source. I could put some RFCs in notes, mixed with rock song lyrics and movie citations. That may give me a real-world set of notes of different lengths.

You're cramped for some weeks? Great, time for me to explore the field and try a pull request. Yes, I'm sure it's simple, but I'm more used to patch by mail (no, I'm not a kernel developer). Time to learn!

-- Stephan

effelsberg avatar Oct 09 '20 08:10 effelsberg

Informal partial success report

Informal means: No code examples at the moment.

I managed to terminate the threads without casualties. The key is to not enable FreeOnTerminate. This adds some extra burden to the programme to free the thread resources manually, but it also provides the necessary control to terminate threads at will and terminating is necessary to start a new search when the search term changes.

The next thing I tried was to not accumulate search results but instead to insert them into the GUI immediately via Synchronize. I used the date strings of the existing list entries as the sorting criterion, i.e. the list is searched for the proper place to insert the note instead of sorting it afterwards. I see that there seems to been an invisible subitem 1 that contains the note's ID, correct? It may be a good idea to add an extra invisible subitem that holds the last change date as a machine readable number, just in case the readable date is formatted like year/day/month.

It's already really nice to see the progressive search in action and it's quite smooth on my machines. In order to get out of the way of potential bottle necks (on somebody else's machine) I will try to move the GUI list filling away from the threads. This means the idea is that the threads will fill the non-GUI search list again, but every here and then the intermediate result will be captured in a thread-save way and the main loop will insert the result into the GUI in larger chunks. However, there is a chance that the context switch from a search thread into the main thread is not sooooooo expensive. It's probably more expensive if a really large number of notes is going to be inserted (when starting to type one or two letters), but having more than 100 search results usually means that the search term was not specific enough. Having only a few results also means having only few context switches. I think nobody will notice ...

Anyway, I will try this during the next days.

effelsberg avatar Oct 09 '20 22:10 effelsberg

hi effelsberg, I have just put some alpha packages on the tomboy-ng 0.34 release page that address this issue (searing while typing). They are relatively untested but I have been using much of the new code for some months with no apparent problems. Depending on how brave you feel, you might like to give it a go.

https://github.com/tomboy-notes/tomboy-ng/releases/tag/v0.34

Davo

davidbannon avatar Feb 01 '22 10:02 davidbannon

OK, all implemented in version 0.35 So, I'll close Thanks for the report. Davo

davidbannon avatar Dec 13 '22 12:12 davidbannon