blink.cmp
Fuzzy/Sorting Mega issue
Note from Saghen
Please feel free to post any of your undesirable sorting behaviors below! Please try setting fuzzy.use_frecency/use_proximity = false before posting and ensure you're on the latest version.
Make sure you have done the following
- [x] Updated to the latest version of blink.cmp
- [x] Searched for existing issues and documentation (try <C-k> on https://cmp.saghen.dev)
Bug Description
Opening the completion menu in vscode / cursor in a Typescript file gives:
In Neovim with blink.cmp, it looks like this:
In a browser DevTools console, the order matches vscode.
I am trying to get blink.cmp to match the vscode order, which through testing looks like it should be:
- Lowercase - Local
- Lowercase - Remote (must be imported)
- Emmets
- Snippets
- Lowercase - Local, deprecated
- Lowercase - Remote, deprecated
- Uppercase - Local
- Uppercase - Remote
- Uppercase - Local, deprecated
- Uppercase - Remote, deprecated
I currently have the following sort order set:
sorts = {
  -- Always prioritize exact matches, case-sensitive.
  "exact",
  -- Sort by fuzzy matching score.
  "score",
  -- Sort by `sortText` field from LSP server, defaults to `label`.
  -- `sortText` often differs from `label`.
  "sort_text",
  -- Sort by `label` field from LSP server, i.e. name in completion menu.
  -- Needed to sort results from LSP server by `label`,
  -- even though protocol specifies default value of `sortText` is `label`.
  "label",
},
For some reason, blink.cmp shows matches that do not start with c before those that do, despite exact being specified.
Moreover, the upper/lower case order is also not considered in the same way as vscode.
PS: I am using LSP vtsls, which should match the one used by vscode.
Any idea what I am doing wrong?
Relevant configuration
sources = {
  default = { "lsp", "path", "snippets", "buffer" },
}
neovim version
NVIM v0.12.0-dev-132+gee3f9a1e03
blink.cmp version
main
The exact match is for when you've typed "foo" matching against "foo", not for when you've typed "foo" matching against "foobar".
Making the label sort prioritize lowercase over capitals makes sense to me. It's possible that _C incorrectly receives a double bonus (delimiter and camelCase bonuses, should just receive delimiter bonus) in the fuzzy matcher which is why it's so prominent, but I'll need to look into it
Thanks @Saghen , just let me know if I can help test further.
In addition to the upper/lowercase issue, it seems a bit counter intuitive to show APP_C... before cache, when the user has typed c, even if APP_C.. is local and cache is remote.
In my head, assuming none of the two items are deprecated, then items starting with c should appear before those starting with ..._C... or ..._c..., despite remote/local differences.
PS: It is correct to differentiate on local | remote package, if we go by vscode/browser completions. However, all those starting with what is actually typed should appear first, and within each such group the engine should sort on local vs. remote (among others).
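To make the distinction between "exact" and "starts with" concrete, here is a minimal standalone Lua sketch. It is not blink.cmp's implementation: the helpers `is_exact`, `is_prefix` and `prefix_first` are hypothetical names, and the labels are made up for illustration.

```lua
-- Hypothetical helpers for illustration only; not part of blink.cmp.
local function is_exact(query, label)
  -- "exact" in the sense described above: the whole label equals the query.
  return label == query
end

local function is_prefix(query, label)
  -- "starts with": the label begins with the query (case-sensitive).
  return label:sub(1, #query) == query
end

-- Returns a comparator that ranks labels starting with `query` first
-- and falls back to plain string order within each group.
local function prefix_first(query)
  return function(a, b)
    local pa, pb = is_prefix(query, a), is_prefix(query, b)
    if pa ~= pb then
      return pa
    end
    return a < b
  end
end

local labels = { "MY_CONST", "cache", "Cache" }
table.sort(labels, prefix_first("c"))
-- "cache" is now the first entry, because it is the only label starting with "c".
```

Typing "c" is a prefix match for "cache" but not an exact match, which is why the exact sort alone does not move it to the top.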
I am also very confused by the behaviour of exact match
// config
fuzzy = {
  sorts = {
    'exact',
    'score',
    'sort_text',
  },
},
@jjiangweilan I think we can conclude, as @Saghen said, that the issue is with _<char>, e.g. _A_B in your case, being given too much weight under the hood. It is unrelated to exact.
Yep, working on it in https://github.com/Saghen/frizbee/pull/28
@Saghen Not sure if this is related, but check out this odd sorting (using the default sorts = { 'score', 'sort_text' }):
It starts out with two entries that seem completely misplaced. It should prioritize the ones that start with the search term.
Any idea what is going on here?
Not sure if this is what you are already doing, but perhaps it would be better to just use fzf under the hood.
EDIT:
I tried turning off frecency and proximity, but that did not fix the above.
Then I changed the implementation to lua, and that solved the issue.
@Saghen I have done additional research. It seems both lua and rust give the wrong order, each in their own unique ways:
implementation = "lua":
- Order is non-deterministic. Around 50% of the time, in a TS file, snippets all show up at the top of the completion list.
- Otherwise, this setting works pretty well (it does not have the order issues of rust, explained below).
The completion items in the last picture above seem wrong. Snippets should not be suggested after a dot.
implementation = "prefer_rust_with_warning":
- Often shows items which do not start with the search term before those that do, even with frecency and proximity turned off.
- Non-deterministic: sometimes the list starts with uppercase entries, other times with lowercase.
Note: I was able to replicate the (incorrect) behavior from implementation = "lua" in implementation = "rust" by setting sorts = { 'sort_text' }. I.e. it then shows all snippets at the top around 50% of the time the file is opened.
I imagine the order of the sources is non-deterministic because we don't have source priorities (#1098) so whichever one finishes first gets put first in the list. The default snippets.score_offset = -3 would push it below the LSP source. You're not using the score sort so there's no difference between lua and rust implementations in this case.
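As a configuration sketch of the score offset mentioned above: the fragment below sets `score_offset` on the snippets provider and keeps `score` in the sort list. The surrounding table layout is assumed from the configs posted in this thread, and -3 is simply the default quoted above, not a recommendation.

```lua
-- Fragment of a blink.cmp opts table (layout assumed, not a full config).
sources = {
  default = { "lsp", "path", "snippets", "buffer" },
  providers = {
    snippets = {
      -- Default quoted in this thread; a lower value should push snippets further down.
      score_offset = -3,
    },
  },
},
fuzzy = {
  -- Assumption: the offset only matters when "score" is one of the sorts,
  -- since it adjusts the score rather than the sort_text.
  sorts = { "score", "sort_text" },
},
```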
The built-in table.sort in Lua is unstable, so with the score sort enabled, there will still be some cases where the order changes between runs. We would need to make sure we can still handle thousands of items after making that stable.
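One common way to get a stable order out of table.sort is to remember each item's original index and use it as the final tie-breaker. The sketch below shows that technique in plain Lua; `stable_sort` is a hypothetical helper, not something blink.cmp ships, and the extra wrapper table per item is exactly the kind of overhead that would need measuring on thousands of items.

```lua
-- Hypothetical helper: stable sort built on top of the unstable table.sort.
local function stable_sort(items, less)
  -- Wrap every item together with its original position.
  local decorated = {}
  for i, item in ipairs(items) do
    decorated[i] = { item = item, index = i }
  end

  table.sort(decorated, function(a, b)
    if less(a.item, b.item) then return true end
    if less(b.item, a.item) then return false end
    -- Neither is "less": keep the original relative order.
    return a.index < b.index
  end)

  -- Write the sorted items back into the original table.
  for i, d in ipairs(decorated) do
    items[i] = d.item
  end
  return items
end

local items = {
  { label = "b", score = 1 },
  { label = "a", score = 1 },
  { label = "c", score = 2 },
}
stable_sort(items, function(x, y) return x.score > y.score end)
-- Order is now c, b, a: the two items with equal score keep their input order.
```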
Snippets should not be suggested after a dot.
https://cmp.saghen.dev/recipes.html#hide-snippets-after-trigger-character
The incorrect double bonus on _C was resolved by https://github.com/Saghen/blink.cmp/commit/c3a54218bc799bd497db4fb7132d60b14b31707a
Hi @Saghen ,
Sorry for coming back to this, but is this intended default behavior?
If we start writing an "H", it is likely we want to get an item that starts on "H", not an item with H in the middle.
Just wanted to see if it was a bug or a feature, and if there is some way I can fix it.
Thank you.
If we start writing an "H", it is likely we want to get an item that starts on "H", not an item with H in the middle.
I think it'd be great if blink.cmp.CompletionItem had an offset field for the first character index that got matched, to support a sorter like nvim-cmp has here.
It's the second sorter I use below exact with nvim-cmp, and it's the only thing stopping me from switching, because matches like this annoy me too much.
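A rough sketch of what such a sorter could look like, in plain Lua. The `match_offset` field is hypothetical (it is the field being requested, it does not exist on blink.cmp items), and the offset is approximated here by the position of the query's first character in the label. Returning nil for "undecided" is an assumption about how a chain of sorters would fall through to the next one.

```lua
-- Approximate the first matched index with a plain-text search for the
-- query's first character. Real matchers may pick a different position.
local function first_offset(query, label)
  return label:find(query:sub(1, 1), 1, true)
end

-- Build a fake item carrying the hypothetical `match_offset` field.
local function with_offset(query, label)
  return { label = label, match_offset = first_offset(query, label) or math.huge }
end

-- Comparator: earlier match wins; nil means "no opinion".
local function by_match_offset(a, b)
  if a.match_offset == b.match_offset then
    return nil
  end
  return a.match_offset < b.match_offset
end

local items = { with_offset("H", "outerHeight"), with_offset("H", "Highlight") }
table.sort(items, function(a, b) return by_match_offset(a, b) == true end)
-- "Highlight" (match at index 1) now comes before "outerHeight" (match at index 6).
```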
This kinda bothers me too. I guess 'exact' means it's also case sensitive, but I wish it could be smart-case sensitive. There is a related feature request issue: https://github.com/Saghen/blink.cmp/issues/1158
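For reference, "smart case" usually means: compare case-insensitively unless the query itself contains an uppercase letter. A minimal standalone Lua sketch of that rule follows; `smart_case_equal` is a hypothetical helper, not an existing blink.cmp option, and it only handles ASCII letters.

```lua
-- Hypothetical helper illustrating smart-case exact matching (ASCII only).
local function smart_case_equal(query, label)
  if query:find("%u") then
    -- The query contains an uppercase letter: be case-sensitive.
    return label == query
  end
  -- All-lowercase query: ignore case.
  return label:lower() == query:lower()
end
```

With this rule, typing `foo` would count as an exact match for `Foo`, while typing `Foo` would not match `foo`.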
Still having this issue after upgrading to 1.2.
https://github.com/Saghen/blink.cmp/issues/1642#issuecomment-2867111308
Looks like vtsls sends the item as .outerHeight and .Highlight which is quite unusual and breaks the prefix bonus (since H ends up being the second character, not the first, in .Highlight). I'd like to workaround this by ignoring non-alphanumeric characters in frizbee when calculating the prefix bonus but it's non trivial.
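The workaround described above can be sketched in plain Lua as "strip leading non-alphanumeric characters, then check the prefix". This is only an illustration of the idea, not how frizbee computes the prefix bonus, and both helper names are hypothetical.

```lua
-- Hypothetical helper: drop leading characters that are not letters or digits.
-- Note that "_" is not alphanumeric for %w, so "_foo" becomes "foo" as well.
local function strip_leading_punctuation(label)
  -- Parentheses discard gsub's second return value (the replacement count).
  return (label:gsub("^[^%w]+", ""))
end

-- Hypothetical helper: prefix check that ignores the leading punctuation.
local function has_prefix(query, label)
  local stripped = strip_leading_punctuation(label)
  return stripped:sub(1, #query) == query
end
```

Under this check, `H` is a prefix of `.Highlight` but not of `.outerHeight`, which is the ordering asked for earlier in the thread.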
https://github.com/Saghen/blink.cmp/issues/1642#issuecomment-2868525843 https://github.com/Saghen/blink.cmp/issues/1642#issuecomment-2868903691
Both of these would be solved by a bonus for matching on continuous characters, which is also non-trivial to implement unfortunately. We could increase the gap penalty but that may lead to #1598 showing up again
I got a weird issue: I have a folder /slack and a file /swapfile, and completing /sl in the cmdline selects /swapfile instead of the expected /slack.
The weird part is that it works correctly in a buffer with the path source, and none of these settings make a difference:
fuzzy = {
  sorts = { 'exact', 'score', 'sort_text' },
  use_frecency = false,
  use_proximity = false,
  max_typos = function() return 0 end,
},
But with implementation = 'lua' it works correctly, i.e. /slack is completed in the cmdline too.
I've converted to blink.cmp recently and overall thrilled with the speed, reliability, and ability to tweak it. I'm never going back, this module a huge step forward (thank you!).
That said, the matcher is still driving me mildly mad. In short, I strongly want it to use exact prefix matching above all the other fuzzy rules. Most of the time I know what I want and just need it to save keystrokes, not try to be creative.
Two items are important to me:
Item 1: PREFIX_BONUS
From my experience over a week, PREFIX_BONUS is way too low for my taste. I ended up monkeypatching this bonus to 12 in my local nvim and that was a game changer for me. I am happy with my patch, but exposing these bonus values would be great.
With this adjustment 99% of the code matches are what I'd consider reasonable.
Item 2: Path matching
I'm still misconfigured or confused how the matching works for paths in cmdline mode.
My relevant settings (although I've tried less complex ones, as well as using frecency/proximity; I'm okay to change them but want to keep with Lua given the monkeypatch for item 1):
completion = {
  menu = { auto_show = true },
  list = {
    selection = {
      preselect = true,
      auto_insert = true,
    },
  },
},
fuzzy = {
  implementation = 'lua',
  sorts = { 'exact', 'score', 'sort_text' },
  use_frecency = false,
  use_proximity = false,
  max_typos = function()
    return 0
  end,
},
The behavior that I'm seeing (as best I can tell) is that the needle for the match is using the text from the current directory (i.e. the filename ignoring the base path). At least in lua, it looks like that is being matched with haystack options that are the full paths (so including the base path). Is that the underlying implementation? It feels like these should be consistent.
Here I'd want init.lua as the most likely choice. But the ini matches are occurring on the common base path rather than items in the current directory.
Really appreciate it @millerjason! The path matching issue should be fixed as of https://github.com/Saghen/blink.cmp/commit/5cf9a786622764f4a8b90735c44e12009ae2e9fc
I'm curious about the PREFIX_BONUS change you mentioned, as 12 is the default in frizbee, do you mind elaborating on that? It was however set to 6 before this change
Edit: Sorry, just re-read and realized you're using the Lua matcher
The main issue with the scoring I've noticed is that foo matches with a higher score on f_o_o than on foo. I'm currently using a locally built version with the GAP_OPEN_PENALTY = 5 and DELIMITER/CAPITAL_BONUS = 4 rather than the previous 3 and 8, which prioritizes foo over f_o_o. I'll likely make this the default in the next version.
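The effect of those two constants can be seen with a deliberately simplified scorer. The Lua sketch below is not frizbee's or blink.cmp's algorithm: it matches greedily from left to right, uses a made-up `MATCH_SCORE`, treats only `_` as a delimiter, and ignores prefix bonuses, gap extension and typos. It only shows why a delimiter bonus larger than the gap-open penalty lets f_o_o outscore foo.

```lua
-- Toy scorer for illustration only; all constants and names are hypothetical.
local MATCH_SCORE = 10

local function toy_score(needle, haystack, gap_open_penalty, delimiter_bonus)
  local score, pos, prev = 0, 1, 0
  for i = 1, #needle do
    local ch = needle:sub(i, i)
    local found = haystack:find(ch, pos, true) -- plain-text search from `pos`
    if not found then
      return nil -- needle does not match at all
    end
    score = score + MATCH_SCORE
    -- Characters were skipped between two matches: a gap was opened.
    if prev > 0 and found > prev + 1 then
      score = score - gap_open_penalty
    end
    -- The matched character directly follows a delimiter.
    if found > 1 and haystack:sub(found - 1, found - 1) == "_" then
      score = score + delimiter_bonus
    end
    prev, pos = found, found + 1
  end
  return score
end

-- Previous constants from the comment above: gap 3, delimiter bonus 8.
local old_plain, old_split = toy_score("foo", "foo", 3, 8), toy_score("foo", "f_o_o", 3, 8)
-- Proposed constants: gap 5, delimiter bonus 4.
local new_plain, new_split = toy_score("foo", "foo", 5, 4), toy_score("foo", "f_o_o", 5, 4)
```

In this toy model each `_o` nets bonus minus penalty: +5 with the old constants (so f_o_o scores 40 against 30 for foo) and -1 with the new ones (28 against 30), which flips the order in favor of foo.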
I've opened a PR that should fix the last issue I mentioned above. Do you mind giving it a shot and seeing if the behavior works better for you? I have to run so I admittedly haven't tested the lua gap penalty change. I'd be particularly interested in knowing which cases were fixed by increasing the prefix bonus on your end.
using the fuzzy-scoring branch, I am getting ../lazy/blink/lua/blink/cmp/fuzzy/lua/match.lua:76: fbb should match barbazfoobarbaz with score 18 assertion error
Thank you, @Saghen.
I am seeing the same error as @jjiangweilan. But even if I comment out all of the assert tests, that branch still exhibits the same undesirable behavior for path matches. Confusingly, if I go to the latest main branch, things work much better now. Did you tackle this in 2c3d276?
I've converted to blink.cmp recently and overall thrilled with the speed, reliability, and ability to tweak it. I'm never going back, this module a huge step forward (thank you!).
That said, the matcher is still driving me mildly mad. In short, I strongly want it to use exact prefix matching above all the other fuzzy rules. Most of the time I know what I want and just need it to save keystrokes, not try to be creative.
Hi, I'm sorry if I'm just adding noise to this thread but this matches my experience switching to blink.cmp closely.
I'm overall very happy with the performance and ease of configuration,
but the suggestions, especially for the cmdline have been a little counterintuitive out of the box.
As an example when I type in :La<Tab> I get abclear back as the first result, where I would expect Lazy.
I have been able to get it to mostly do what I expect through trial and error now.
Two items are important to me:
Item 1: PREFIX_BONUS
From my experience over a week, PREFIX_BONUS is way too low for my taste. I ended up monkeypatching this bonus to 12 in my local nvim and that was a game changer for me. I am happy with my patch, but exposing these bonus values would be great.
With this adjustment 99% of the code matches are what I'd consider reasonable.
Item 2: Path matching
I'm still misconfigured or confused how the matching works for paths in cmdline mode.
My relevant settings (although I've tried less complex ones, as well as using frecency/proximity; I'm okay to change them but want to keep with Lua given the monkeypatch for item 1):
completion = {
  menu = { auto_show = true },
  list = {
    selection = {
      preselect = true,
      auto_insert = true,
    },
  },
},
fuzzy = {
  implementation = 'lua',
  sorts = { 'exact', 'score', 'sort_text' },
  use_frecency = false,
  use_proximity = false,
  max_typos = function() return 0 end,
},
This in particular has been very useful to get the completion behavior more in line with how I expected it to work out of the box.
Although I haven't looked into monkeypatching the PREFIX_BONUS yet.
FWIW, with the recently merged fixes I was able to give up on all my edits and monkey patches. Now I'm just using this version (Lua variant) and it works great, even in cmdline mode:
'saghen/blink.cmp',
version = 'v1.*',
branch = 'main',
commit = '2c3d276',
Thanks for all the quick fixes @Saghen !
I don't feel the m should match twice, and certainly the a shouldn't match. Looks like the a is matching instead / as a substitute of the _?
fuzzy = {
  implementation = 'prefer_rust_with_warning',
  sorts = {
    'score',
    'sort_text',
  },
  max_typos = 0,
}
neovim version
NVIM v0.12.0-dev-cf5506f0fd Build type: RelWithDebInfo LuaJIT 2.1.1744317938
blink.cmp version
v1.4.1
I'm wondering if it's possible to disable or configure fuzzy matching conditionally, or more specifically: only in cmdline. I really like the fuzzy matching for cmdline in many cases, but it doesn't work super well for / when there isn't a match for what I'm looking for.
Like I might be doing /logging and then the blink autocomplete menu only has toggling and with my enter key mapped to accept_and_enter, it searches for something different than I originally intended. Like, if I could fuzzy match on prefix only for cmdline that would almost be perfect.
You can follow the progress on prefix matching in #1258 @chancez
Going to close this issue as I believe the big fuzzy matching issues have been fixed by this point. Remaining issues can be in their own issues with the fuzzy label