telescope-frecency.nvim

[feature] Can we use both fuzzy matching & frecency scoring at the same time?

Open delphinus opened this issue 7 months ago • 18 comments

Following #163 and some other issues, there is demand for fuzzy matching on frecency results. The current build only allows sorters.get_substr_matcher, because it is the only “sorter” that matches candidates without sorting them.

If a “sorter” that fuzzy-matches but does not sort existed, it might be useful for those users. I want to try creating one.

delphinus avatar Jan 06 '24 11:01 delphinus

I added a naive implementation of fuzzy matching in #166. It simply converts the prompt string, such as foobar, into a regexp, f.*o.*o.*b.*a.*r, and matches against that. I tested it, but it matches too many candidates to pick from. I'd like opinions from other users.
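
For readers who want to see the conversion described above, here is a minimal sketch in plain Lua. It is not the code from #166: the function name `to_fuzzy_pattern` is hypothetical, and it builds a Lua pattern rather than a regexp, escaping non-alphanumeric characters so that a prompt like `server.py` does not treat `.` as a wildcard. It assumes ASCII prompts.

```lua
-- Hypothetical helper (not from #166): turn a prompt into a Lua pattern
-- that allows any characters between consecutive prompt characters.
local function to_fuzzy_pattern(prompt)
  local parts = {}
  for ch in prompt:gmatch "." do
    -- Escape every non-alphanumeric character with "%" so it is literal.
    parts[#parts + 1] = (ch:gsub("%W", "%%%0"))
  end
  return table.concat(parts, ".*")
end

print(to_fuzzy_pattern "foobar")
print(("server.py"):find(to_fuzzy_pattern "serpy") ~= nil)
```

Because `.*` accepts any run of characters, a short prompt matches a large share of long paths, which is consistent with the "too many candidates" observation.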

delphinus avatar Jan 15 '24 12:01 delphinus

I've just discovered this project and am super-grateful for your effort!

The first thing I stumbled across was not matching stuff with fuzzy searches. By now, it's deeply ingrained in my brain to just type 2-4-letter pieces to find the file (serpy for server.py) which works fantastically well with built-in searches.

This would be the single most impactful improvement for me.

kirillrogovoy avatar Apr 15 '24 11:04 kirillrogovoy

I see. I tried #166 and didn't find it very useful. But letting users select a “fuzzy” or “unfuzzy” matcher in the config is a good idea. I will implement that.

-- such as this way?
telescope.setup {
  extensions = {
    frecency = {
      matcher = "fuzzy" -- acceptable: "default" / "fuzzy"
    },
  },
}

delphinus avatar Apr 16 '24 06:04 delphinus

Now I'm testing matcher: fuzzy in #166. This matcher matches the pattern against the basename only. I also created matcher: fuzzy_full, which matches full paths, but I think it matches too many candidates to narrow them down.

I will dogfood #166 for a while.

delphinus avatar Apr 16 '24 12:04 delphinus

Question - would this feature solve what I'm looking for below?

I have a directory like:

src/lib/sanity.ts
src/lib/types.ts

I'd like to be able to type libsan to match src/lib/sanity.ts, but currently that does not work.

Instead, I need to include the /, i.e. lib/san.

This example is simple, but my project is quite large and I'm not always going to remember the full path, nor do I want to type it out. For instance, typing comsch would ideally match src/components/ui/forms/schedule.tsx

good-idea avatar Apr 24 '24 17:04 good-idea

Yes. The fuzzy_full matcher fits your use case well. But it has a side effect: it also matches candidates you probably did not intend.

# your input “libsan” matches below.
/path/to/www/src/lib/sanity.ts
# also this matches: `Lib……S……A……N` => “libsan”
/Users/foo/Library/Some/Arbitrary/Nice/project.ts
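
The two paths above can be checked with a small plain-Lua sketch. This is an illustration only, not the plugin's matcher: `fuzzy_match` and `basename` are hypothetical helpers, and matching is assumed to be a case-insensitive subsequence test.

```lua
-- Hypothetical helper: true when every character of `prompt` appears in
-- `text` in order, ignoring case (a plain subsequence test).
local function fuzzy_match(prompt, text)
  local pos = 1
  text = text:lower()
  for ch in prompt:lower():gmatch "." do
    local found = text:find(ch, pos, true) -- plain find, no pattern magic
    if not found then
      return false
    end
    pos = found + 1
  end
  return true
end

-- Hypothetical helper: last path component.
local function basename(path)
  return path:match "[^/]+$" or path
end

local paths = {
  "/path/to/www/src/lib/sanity.ts",
  "/Users/foo/Library/Some/Arbitrary/Nice/project.ts",
}

for _, path in ipairs(paths) do
  -- Columns: path, full-path match, basename-only match
  print(path, fuzzy_match("libsan", path), fuzzy_match("libsan", basename(path)))
end
```

Under these assumptions both full paths match `libsan`, while neither basename does; a basename-only matcher would need a prompt such as `san`.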

This is why I am hesitant to include this feature in frecency. There are two ways to reduce this inconvenience.

  1. Limit candidates to a certain project with the workspace feature.
    • :Telescope frecency workspace=Foo shows candidates only from the Foo project.
    • You can make this permanent with the default_workspace option. For example, default_workspace = "CWD" always shows candidates from your current working directory.
  2. Use the fuzzy matcher instead of fuzzy_full.
    • This matches your input against the basename of paths only.
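
As a config fragment, the two options might be combined as below. This is a sketch based only on the option names mentioned in this thread (default_workspace, matcher); the accepted values may differ in the released plugin.

```lua
-- Sketch: restrict results to the current working directory and use the
-- basename-only fuzzy matcher discussed in this thread.
require("telescope").setup {
  extensions = {
    frecency = {
      default_workspace = "CWD", -- only show files under the current working directory
      matcher = "fuzzy", -- value as proposed in this thread
    },
  },
}
```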

Do you want to use this feature even with these points?

delphinus avatar Apr 25 '24 01:04 delphinus

Just a few thoughts:

  1. Some/Arbitrary/Nice/project.ts is maybe not so arbitrary if it has a frecency score, right? :)
  2. The fuzzy matcher isn't changing the sorting, right? So the most frecent file will still be closer to the cursor? So even though some random undesired files will match, it might not be a problem in practice.
  3. As the escape hatch, maybe you could expose a setting to override the scoring function fn (frecency_score, fuzzy_score) -> final_score which essentially redefines sorting. For those who want more control over how much each score affects the final sequence of files.
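
To make point 3 concrete, here is one possible shape for such a function in plain Lua. Everything in it is hypothetical: the name `final_score`, the `fuzzy_weight` parameter, and the assumptions that both inputs are on comparable scales and that a higher output ranks first.

```lua
-- Hypothetical combiner: a weighted sum of the two scores.
-- fuzzy_weight = 0 ranks by frecency only, 1 ranks by fuzzy score only.
local function final_score(frecency_score, fuzzy_score, fuzzy_weight)
  fuzzy_weight = fuzzy_weight or 0.5
  return (1 - fuzzy_weight) * frecency_score + fuzzy_weight * fuzzy_score
end

print(final_score(10, 20))    -- equal weighting
print(final_score(10, 20, 0)) -- frecency only
print(final_score(10, 20, 1)) -- fuzzy only
```

A user-supplied function like this would let each person decide how much the fuzzy score is allowed to reorder frecent files.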

kirillrogovoy avatar Apr 25 '24 08:04 kirillrogovoy

  1. ~Yes.~ No. It is probably not arbitrary; that is, you must have opened the file many times in such a case.
  2. ~Also yes.~ No, it doesn’t change the order in the current implementation. In my environment it produces too much noise to pick one, but maybe that is because my DB is so large.
  3. That seems nice. fuzzy_full will work well for most users, and users with a huge DB, like me, can use the function. I will try to add that.

edited.

Sorry, I mixed up yes/no at first, because in my native language, Japanese, I answer “yes” to a negative question when I agree. ;)

delphinus avatar Apr 25 '24 09:04 delphinus

Do you want to use this feature even with these points?

  1. Some/Arbitrary/Nice/project.ts is maybe not so arbitrary if it has a frecency score, right? :)

Yes - that would be fine. Plus, once I actually navigate to lib/sanity a couple of times, it will rank higher on the frecency score anyways - so /Library/Some/Arbitrary/Nice/project.ts showing up in the results wouldn't be an issue.

good-idea avatar Apr 25 '24 23:04 good-idea

I've updated #166. The current logic calculates scores from recency and the fzy score computed by telescope.algos.fzy. It seems to work well for me. Still testing.

delphinus avatar Apr 26 '24 13:04 delphinus

It works well in my environment. I merged #166 and added matcher = "fuzzy". Feel free to reopen this if any of you want to discuss it further.

delphinus avatar Apr 28 '24 09:04 delphinus

Awesome, thanks a lot @delphinus!

For the reference, here's the config that does what I was looking for:

frecency = {
  matcher = "fuzzy",
  scoring_function = function(recency, fzy_score)
    return -recency
  end
}

It seems to only filter the list while the order still depends ONLY on the recency score. So now I basically use either of two file searches: fuzzy only (default telescope find_files with rg in my case) or frecency only.

I found that mixing the two leads to unpredictable sorting on every character I type (at least in my experience, ymmv).

Will keep it like that for a week to see if there's more to it. 🙌

kirillrogovoy avatar Apr 29 '24 07:04 kirillrogovoy

@kirillrogovoy I also tried your suggested function. I noticed it is hard to select a candidate with a very low recency score, for example a file I have opened only once.

For example, consider a case like this.

  1. I want to open /path/to/hello.js, which has a recency score of 10 because I have opened it only once.
  2. The DB has a file /path/to/high/element/low/foo.js with a much higher score: 1000.
  3. When I input hel, /path/to/high/element/low/foo.js ranks higher than /path/to/hello.js. This is probably not intended.
  4. When I input hello.j, /path/to/hello.js comes first.
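
Step 3 can be reproduced outside the plugin with a short plain-Lua sketch. It assumes that both paths pass the filter for hel and that a lower score sorts first, which is what the `-recency` function implies; the `entries` table and the sort call are illustrative, not the plugin's internals.

```lua
-- Recency values taken from the example above.
local entries = {
  { path = "/path/to/hello.js", recency = 10 },
  { path = "/path/to/high/element/low/foo.js", recency = 1000 },
}

-- The recency-only scoring function from the earlier comment.
local function scoring_function(recency, _fzy_score)
  return -recency
end

-- Assumption: lower score ranks first.
table.sort(entries, function(a, b)
  return scoring_function(a.recency) < scoring_function(b.recency)
end)

for rank, entry in ipairs(entries) do
  print(rank, entry.path)
end
```

Since the fuzzy score is ignored, the file with recency 1000 stays on top however closely the prompt resembles hello.js, as long as both files still match.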

But it might be a matter of taste. I would like many users to test my logic.

delphinus avatar Apr 30 '24 02:04 delphinus

I'm reopening this just to be sure.

delphinus avatar Apr 30 '24 02:04 delphinus

Thanks for sharing! I think it's definitely a matter of taste or just preferred use cases.

I use the frecency search only to navigate among the last ~5-20 opened files while working on a task. When the file is not frecently visited, I look it up with the standard find_files. The biggest time saver here for me is that I can just type something like "controller" and have the most related controller to my current task as the first/second suggested option.

Some people handle it differently: e.g. they have a dozen buffers open with the files they need and they use the buffer search to navigate. For me, the frecency search is like that, but without the need to manage those buffers. :)

This tactic definitely doesn't work if you intend to use the frecency search most of the time in place of find_files, exactly for the reasons you've mentioned. 👍

kirillrogovoy avatar Apr 30 '24 07:04 kirillrogovoy

Fuzzy works great for me, thank you!

good-idea avatar May 13 '24 21:05 good-idea