terminal icon indicating copy to clipboard operation
terminal copied to clipboard

Update command palette search to prioritize "longest substring" match.

Open e82eric opened this issue 8 months ago • 5 comments

Summary of the Pull Request

Addresses: https://github.com/microsoft/terminal/issues/6693

Repurposed work from https://github.com/microsoft/terminal/pull/16586

  • I think the fzf algo fits here where it optimizes to find the optimal match based on consecutive chars and word boundaries.
    • There are some edge cases where a match with a small gap could get a higher score than a match of consecutive chars when the match with a gap has other bonuses (FirstChar * Boundary Bonus). This can be adjusted by adjusting the bonuses or removing them if needed.
  • From reading the thread in https://github.com/microsoft/terminal/issues/6693 it looked like you guys were leaning towards something like the fzf algo.
  • License file is now updated in https://github.com/nvim-telescope/telescope-fzf-native.nvim repository
    • https://github.com/nvim-telescope/telescope-fzf-native.nvim/pull/148
    • https://github.com/junegunn/fzf/issues/4310
  • Removed the following from the original implementation to minimize complexity and the size of the PR. (Let me know if any of these should be added back).
    • Query expressions "$:StartsWith ^:EndsWith |:Or !:Not etc"
    • Slab to avoid allocating the scoring matrix. This felt like overkill for the number of items in the command pallete.
    • Fallback to V1 algorithm for very long strings. I want to say that the command palette won't have strings this long.
  • Added the logic from GH#9941 that copies pattern and text chars to string for comparision with lstrcmpi
    • It does this twice now which isn't great...

Feel free to close if it doesn't make sense for what ever reason (E.g. License file still an issue, wrong direction for fix, too complex, etc).

palette2

References and Relevant Issues

https://github.com/microsoft/terminal/issues/6693

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

PR Checklist

  • [ ] Closes #xxx
  • [ ] Tests added/passed
  • [ ] Documentation updated
    • If checked, please file a pull request on our docs repo and link it here: #xxx
  • [ ] Schema updated (if necessary)

e82eric avatar Mar 18 '25 02:03 e82eric

Sorry for the spam open/close. Let me know if I you guys are interested in this and I can reopen it.

e82eric avatar Apr 10 '25 21:04 e82eric

Generally speaking, probably yes, but I personally haven't gotten a chance yet to look at it. Currently every member of the terminal team is unfortunately either working on another project or on sick leave, hence the overall delays since late last year. One of the things @zadjii-msft made was released just recently: https://learn.microsoft.com/en-us/windows/powertoys/command-palette/overview The project I'm working on will be released soon as well.

I plan to review this PR after I fixed the remaining major issues the the Preview/Canary version has. (Which is what has occupied what little time I had for Windows Terminal since then.)

lhecker avatar Apr 10 '25 23:04 lhecker

Sorry, didn't mean to rush you, was more worried I was spamming you guys with all of these fuzzy finding PRs. Please let me know if it doesn't make sense for what ever reason and I can close it.

e82eric avatar Apr 11 '25 02:04 e82eric

I'm also somewhat confused by the code's dissimilarity with the original fzf to be honest (that's not an issue in on itself; it just surprised me).

I think there are a couple of things that contributed this.

  • The original telescope-native port that I ported was done back in 2022 before the delimiter class type etc, and I think he had removed some stuff like the normalization.
  • When I ported it to c++ I mangled all of the variable names trying to understand how all of the dynamic programming and back tracking work.

Should I try to update use the latest fzf logic?

e82eric avatar Apr 29 '25 04:04 e82eric

That aside, for the upcoming project that I've worked on, I ported VS Code's fuzzy matcher to Rust. Here's the original: https://github.com/microsoft/vscode/blob/071eafaf5ab4de3814466b616fe00a66d2739ad2/src/vs/base/common/fuzzyScorer.ts If you'd like I could generate a C++ port of it next week based on my Rust version. The benefit of that code is that it's a lot simpler. The disadvantage is that fzf is a leagues better algorithm and if we had it, it would definitely be superior of course. I guess it's a trade-off.

Totally understand if the VS Code algo is more maintainable etc, let me know what direction you guys want to go. I am happy to try to port the typescript or the rust code or incorporate your port of the rust code.

e82eric avatar Apr 29 '25 04:04 e82eric

@check-spelling-bot Report

:red_circle: Please review

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

Unrecognized words (18)
AHave
aoaoabound
aobar
aoobar
didx
equalish
fbb
oaaaaaaaaaaabar
oaaabzzar
oabar
oabzazr
oaoabound
obar
oobar
Raz
ucase
zshc
zshcompctl
These words are not needed and should be removed abcd ABCDEFGHIJ abcdefghijk ABCDEFGHIJKLMNOPQRS ABCDEFGHIJKLMNOPQRST ABCDEFGHIJKLMNOPQRSTUVWXY allocs appium Argb asan autocrlf backported bytebuffer cac CLE codenav codepath commandline COMMITID componentization constness dealloc deserializers DISPATCHNOTIFY DTest entrypoints EVENTID FINDUP foob fuzzer fuzzyfinder hlocal hstring IInput Interner keyscan Munged numlock offboarded pids positionals Productize pseudoterminal remoting renamer roadmap ruleset SELECTALL somefile Stringable tearoff TODOs touchpad TREX Unregistering USERDATA vectorize viewports wsl

To accept these unrecognized words as correct and remove the previously acknowledged and now absent words, you could run the following commands

... in a clone of the [email protected]:e82eric/terminal.git repository on the command-palette-search-fzf branch (:information_source: how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/v0.0.24/apply.pl' |
perl - 'https://github.com/microsoft/terminal/actions/runs/15180185675/attempts/1'

Errors (1)

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

:x: Errors Count
:x: ignored-expect-variant 3

See :x: Event descriptions for more information.

:pencil2: Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

:microscope: You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. :wink:

If the flagged items are :exploding_head: false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it, try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

github-actions[bot] avatar May 22 '25 07:05 github-actions[bot]

@check-spelling-bot Report

:red_circle: Please review

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

Unrecognized words (18)
AHave
aoaoabound
aobar
aoobar
equalish
fbb
oaaaaaaaaaaabar
oaaabzzar
oabar
oabzazr
oaoabound
obar
oobar
Raz
separatorbonus
ucase
zshc
zshcompctl
These words are not needed and should be removed abcd ABCDEFGHIJ abcdefghijk ABCDEFGHIJKLMNOPQRS ABCDEFGHIJKLMNOPQRST ABCDEFGHIJKLMNOPQRSTUVWXY allocs appium Argb asan autocrlf backported bytebuffer cac CLE codenav codepath commandline COMMITID componentization constness dealloc deserializers DISPATCHNOTIFY DTest entrypoints EVENTID FINDUP foob fuzzer fuzzyfinder hlocal hstring IInput Interner keyscan Munged numlock offboarded pids positionals Productize pseudoterminal remoting renamer roadmap ruleset SELECTALL somefile Stringable tearoff TODOs touchpad TREX Unregistering USERDATA vectorize viewports wsl

To accept these unrecognized words as correct and remove the previously acknowledged and now absent words, you could run the following commands

... in a clone of the [email protected]:e82eric/terminal.git repository on the command-palette-search-fzf branch (:information_source: how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/v0.0.24/apply.pl' |
perl - 'https://github.com/microsoft/terminal/actions/runs/15229765423/attempts/1'

Errors (1)

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

:x: Errors Count
:x: ignored-expect-variant 3

See :x: Event descriptions for more information.

:pencil2: Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

:microscope: You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. :wink:

If the flagged items are :exploding_head: false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it, try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

github-actions[bot] avatar May 24 '25 18:05 github-actions[bot]

@check-spelling-bot Report

:red_circle: Please review

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

Unrecognized words (2)

equalish separatorbonus

These words are not needed and should be removed abcd ABCDEFGHIJ abcdefghijk ABCDEFGHIJKLMNOPQRS ABCDEFGHIJKLMNOPQRST ABCDEFGHIJKLMNOPQRSTUVWXY allocs appium Argb asan autocrlf backported bytebuffer cac CLE codenav codepath commandline COMMITID componentization constness dealloc deserializers DISPATCHNOTIFY DTest entrypoints EVENTID FINDUP foob fuzzer fuzzyfinder fzf hlocal hstring IInput Interner keyscan Munged numlock offboarded oob pids positionals Productize pseudoterminal remoting renamer roadmap ruleset SELECTALL somefile Stringable tearoff TODOs touchpad TREX Unregistering USERDATA vectorize viewports wsl

To accept these unrecognized words as correct and remove the previously acknowledged and now absent words, you could run the following commands

... in a clone of the [email protected]:e82eric/terminal.git repository on the command-palette-search-fzf branch (:information_source: how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/v0.0.24/apply.pl' |
perl - 'https://github.com/microsoft/terminal/actions/runs/15229796231/attempts/1'

Errors (1)

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

:x: Errors Count
:x: ignored-expect-variant 3

See :x: Event descriptions for more information.

:pencil2: Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

:microscope: You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. :wink:

If the flagged items are :exploding_head: false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it, try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

github-actions[bot] avatar May 24 '25 18:05 github-actions[bot]

Updated the PR to use the VsCode algo from the Edit repo, wanted to see how the results compared to the fzf port. It was more or less like you said where the matching was not quite as good but the implementation was much simpler. Let me know which direction you guys want to go.

There are a couple of differences from the rust code.

  • I didn't add the arena allocator, I had removed the Slab one from the fzf implementation to make things simpler. Should I add this back?
  • I wasn't able to get the case folding from icu::fold_case to work, so I stuck with u_foldCase. When the folding resulted in a expansion it would match a extra position or 1 fewer depending on if the expansion was in the query or the haystack.
    • Ex. query: "fuss" haystack: "Fußball" would result in positions [0, 1, 2, 3, 4] causing the b to be highlighted and ss to count as 2 matches in the score. https://github.com/e82eric/edit/blob/fuzzy-cplusplus-port-tests/src/fuzzy.rs#L359
    • Let me know if I am thinking about this wrong.
  • I am still splitting the query into terms using the space char so that "anta sp" from the example would still match.
    • It looked like VsCode was doing the same

Difference I noted from the fzf code.

  • Not having the gap penalty was more noticable than I would have expected. Matches without consecutive chars seemed kind of random.
    • Ex. query: "sp" hakstack1: "sapbbbbbbb" haystack2: "sabbbbbbbp" would both have the same score;
    • Or if there are multiple non consecutive sequences in a match fzf would pick the one with the least distance between chars.
  • The traceback could pick a sequence with a lower total score when the lower total score has the start of line bonus
    • Ex. query: "bar" haystack: "Boo Author Raz Bar" would match on [0, 4, 11] even though the bar at the end has a high total score. https://github.com/e82eric/terminal/blob/command-palette-search-fzf/src/cascadia/ut_app/FuzzyTests.cpp#L379
      • VsCode has an explanation here. https://github.com/microsoft/vscode/issues/170353

e82eric avatar May 27 '25 00:05 e82eric

It was more or less like you said where the matching was not quite as good but the implementation was much simpler. Let me know which direction you guys want to go.

Things have finally settled down in the edit repository and I now have plenty time to review your PR. If you prefer the fzf approach, I'd be happy to review that and help you with finishing up the PR if needed. If you prefer the VS Code approach, I can also review that. What would you prefer?

I'll make sure it gets into main within this week. 🙂

I didn't add the arena allocator, I had removed the Slab one from the fzf implementation to make things simpler. Should I add this back?

Nah, it's fine this way. Of course, a slab/arena would be ideal, but it doesn't have to be used.

When the folding resulted in a expansion it would match a extra position or 1 fewer depending on if the expansion was in the query or the haystack.

Oh f... that's an issue I haven't considered. 😢 The ICU source code for case folding is here: https://github.com/unicode-org/icu/blob/454ce01479fdf4435e0a6600c38fc205a898501c/icu4c/source/common/ustrcase.cpp#L208-L304 The Edits object can be used for mapping folded offsets back to the original offsets, but that feature is unavailable via the C API. Since I probably can't implement fzf in edit (binary size), I'll probably have to port that internal C++ code to Rust.

lhecker avatar May 27 '25 11:05 lhecker

Things have finally settled down in the edit repository and I now have plenty time to review your PR. If you prefer the fzf approach, I'd be happy to review that and help you with finishing up the PR if needed. If you prefer the VS Code approach, I can also review that. What would you prefer?

I guess my preference is to use the fzf port, having the gap penalties seemed to improve the matching quite a bit. I would totally understand if you guys would prefer to go with the VsCode one to avoid adding the extra complexity to the code base. I'll revert the PR back to the fzf version, let me know if that doesn't make sense and I can switch back to the VsCode version if needed.

I made a couple of attempts tonight at supporting the case folding expansions in the C++ port of the VsCode algo and failed each time. But I think I have a strategy that will work now, I want to say I should be able to get that working in a night or 2 and then another night to add it to the fzf version. Does that timeline work? (I don't want to miss the window where you have extra time for PR reviews).

e82eric avatar May 28 '25 07:05 e82eric

Let's say we go with fzf, since you prefer its behavior. I just read your fzf implementation in fbf76f2d26986e6f4290336afe415f63d7ed37aa and it actually looks absolutely fine to me (it uses the per-codepoint fold-case function which is "good enough" IMO). I'd merge it as-is and improve it as needed later on. How about we

  • backup your current changes
  • revert to fbf76f2d26986e6f4290336afe415f63d7ed37aa and merge that
  • and then improve upon it later?

Or did I miss an issue with fbf76f2d26986e6f4290336afe415f63d7ed37aa? Otherwise, I'd just test it locally first and then approve it. 🙂

lhecker avatar May 28 '25 12:05 lhecker

Sounds good! Just updated.

e82eric avatar May 28 '25 13:05 e82eric

Pushed one more set of updates.

I had a logic issue here where the query wasn't being passed to the fuzzy searcher (https://github.com/microsoft/terminal/commit/fbf76f2d26986e6f4290336afe415f63d7ed37aa wasn't working at all) https://github.com/microsoft/terminal/pull/18700/commits/8bacdcf15a0d99e40aa9ade2fcc2d93e5802e735 should fix that

I also found a issue where I wasn't calculating the bonus for camel case correctly. https://github.com/microsoft/terminal/pull/18700/commits/b2191dd5297457361229697fb1c40443e21dcc25

The other thing that I am worried about is that I still haven't figured out how to run LocalTests_TerminalApp which will probably have some failures from the new algo.

e82eric avatar May 29 '25 06:05 e82eric

@check-spelling-bot Report

:red_circle: Please review

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

Unrecognized words (1)

exectation

These words are not needed and should be removed abcd ABCDEFGHIJ abcdefghijk ABCDEFGHIJKLMNOPQRS ABCDEFGHIJKLMNOPQRST ABCDEFGHIJKLMNOPQRSTUVWXY allocs appium Argb asan autocrlf backported bytebuffer cac CLE codenav codepath commandline COMMITID componentization constness dealloc deserializers DISPATCHNOTIFY DTest entrypoints EVENTID FINDUP foob fuzzer fuzzyfinder hlocal hstring IInput Interner keyscan lstrcmpi Munged numlock offboarded oob pids positionals Productize pseudoterminal remoting renamer roadmap ruleset SELECTALL somefile Stringable tearoff TODOs touchpad TREX Unregistering USERDATA vectorize viewports wsl

To accept these unrecognized words as correct and remove the previously acknowledged and now absent words, you could run the following commands

... in a clone of the [email protected]:e82eric/terminal.git repository on the command-palette-search-fzf branch (:information_source: how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/v0.0.24/apply.pl' |
perl - 'https://github.com/microsoft/terminal/actions/runs/15349302503/attempts/1'

Errors (1)

See the :open_file_folder: files view, the :scroll:action log, or :memo: job summary for details.

:x: Errors Count
:x: ignored-expect-variant 3

See :x: Event descriptions for more information.

:pencil2: Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

:microscope: You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. :wink:

If the flagged items are :exploding_head: false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it, try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

github-actions[bot] avatar May 30 '25 14:05 github-actions[bot]

I went through the PR and played around with it. It works quite well and is subtly better than what we currently have for the cases I need. What I'm missing mostly is some Unicode normalization like fzf has: https://github.com/junegunn/fzf/blob/master/src/algo/normalize.go This could be part of https://github.com/microsoft/edit/tree/main/tools/grapheme-table-gen in the future so we can generate a small trie for the mapping.

I made several changes, many of them subjective, but I think it now better aligns with the existing code style of the project. The only commit that's a little unlike the others is "Simplify & improve performance of UTF16<>UTF32 routines". It only works well if the needle size and hit count is moderately small, due to the O(n^2) behavior. Is that a fair assumption?

lhecker avatar May 30 '25 14:05 lhecker

The other thing that I am worried about is that I still haven't figured out how to run LocalTests_TerminalApp which will probably have some failures from the new algo.

Oh, right, I forgot your question. First you need to build these two projects:

image

Then you do this in PowerShell:

Import-Module .\tools\OpenConsole.psm1
Invoke-OpenConsoleTests -Test localTerminalApp

For me this currently fails in 18 tests, but only 3 of those are due to your PR. We're not doing a great job of maintaining these tests unfortunately. You can tell it to only run your relevant tests like this:

Invoke-OpenConsoleTests -Test localTerminalApp -TaefArgs /name:TerminalAppLocalTests::FilteredCommandTests::*

Edit: I fixed them.

lhecker avatar May 30 '25 14:05 lhecker

/azp run

DHowett avatar May 30 '25 16:05 DHowett

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar May 30 '25 16:05 azure-pipelines[bot]

/azp run

DHowett avatar May 30 '25 16:05 DHowett

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar May 30 '25 16:05 azure-pipelines[bot]

The only commit that's a little unlike the others is "Simplify & improve performance of UTF16<>UTF32 routines". It only works well if the needle size and hit count is moderately small, due to the O(n^2) behavior. Is that a fair assumption?

Thank you guys! (I might be a little slow to respond today).

I think that makes sense, (n is the positions in this case?). I think the only way n could be large is if there was a really large haystack (maybe a snippet?) and the user typed every single character and they all matched.

This might have broken the control that highlights the text though, I think it wants both positions in the surrogate pair to be included in the run or it displays the match as [][]. Do I have that right (I might just be misunderstanding how WinUi works). If so I think I could update the new function to include both positions for the pair (I like the new approach a lot better that the map stuff I was doing before).

For this before it was returning { 6, 5, 4, 3, 2, 1, 0 } and now { 6, 5, 4, 3, 1, 0 }

void FzfTests::SurrogatePair()
    {
        AssertScoreAndPositions(
            L"N😀ewer",
            L"N😀ewer tab",
            ScoreMatch * 6 + BonusBoundary * BonusFirstCharMultiplier + BonusConsecutive * BonusFirstCharMultiplier * 5,
            { 6, 5, 4, 3, 2, 1, 0 });
    }

This was without the 2 position image

This was with the 2 position image

e82eric avatar May 30 '25 17:05 e82eric

If so I think I could update the new function to include both positions for the pair (I like the new approach a lot better that the map stuff I was doing before).

If you could try fixing it, I would greatly appreciate it! Also, please absolutely feel free to revert any of my changes. I consider them mostly optional.

lhecker avatar May 30 '25 18:05 lhecker

If so I think I could update the new function to include both positions for the pair (I like the new approach a lot better that the map stuff I was doing before).

If you could try fixing it, I would greatly appreciate it! Also, please absolutely feel free to revert any of my changes. I consider them mostly optional.

Will do, all of the changes looked like they improved the code to me (I just remember stepping on the extra position thing earlier).

e82eric avatar May 30 '25 18:05 e82eric

The other thing that I am worried about is that I still haven't figured out how to run LocalTests_TerminalApp which will probably have some failures from the new algo.

Oh, right, I forgot your question. First you need to build these two projects:

image

Then you do this in PowerShell:

Import-Module .\tools\OpenConsole.psm1
Invoke-OpenConsoleTests -Test localTerminalApp

For me this currently fails in 18 tests, but only 3 of those are due to your PR. We're not doing a great job of maintaining these tests unfortunately. You can tell it to only run your relevant tests like this:

Invoke-OpenConsoleTests -Test localTerminalApp -TaefArgs /name:TerminalAppLocalTests::FilteredCommandTests::*

Edit: I fixed them.

That worked, I am able to run them locally now. Thank you!!

e82eric avatar Jun 01 '25 00:06 e82eric

What I'm missing mostly is some Unicode normalization like fzf has: https://github.com/junegunn/fzf/blob/master/src/algo/normalize.go This could be part of https://github.com/microsoft/edit/tree/main/tools/grapheme-table-gen in the future so we can generate a small trie for the mapping.

This may be way beyond my understanding of unicode. Would that be adding something like this (from the fzf port you had done)?

    // Normalize rune using ICU NFD and removing combining marks (Mn)
    // This aims to replicate the Go version's accent removal.
    UChar32 normalizeRune(const UChar32 c)
    {
        if (isAscii(c))
        {
            return c;
        }

        static const auto norm = []() {
            UErrorCode status = U_ZERO_ERROR;
            return unorm2_getNFDInstance(&status);
        }();
        if (!norm)
        {
            return c;
        }

        UChar decomp[8];
        UErrorCode status = U_ZERO_ERROR;
        const auto len = unorm2_getDecomposition(norm, c, &decomp[0], ARRAYSIZE(decomp), &status);
        if (U_FAILURE(status) || len <= 0 || !isAscii(decomp[0]))
        {
            return c;
        }

        return decomp[0];
    }

e82eric avatar Jun 01 '25 01:06 e82eric

Yes, although I'm not entirely sure if that would be fast enough and whether it's really the same thing that fzf does. That's something we'd have to experiment with over time.

lhecker avatar Jun 02 '25 13:06 lhecker

/azp run

DHowett avatar Jun 02 '25 15:06 DHowett

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Jun 02 '25 15:06 azure-pipelines[bot]

/azp run

DHowett avatar Jun 02 '25 19:06 DHowett