
Results vary on repeated calls for same query string [ClosestN]

Open raghur opened this issue 7 years ago • 4 comments

So I'm trying out your lib to build a fuzzy file matcher. The input is 1000 filenames, and I built a small Python client that calls the server. On the first call, the server builds the closestmatch.ClosestMatch structure and reuses it for subsequent calls. Interestingly, when I type in the same query, I get 10 different results each time. Is this how it's supposed to work?

Here's the server log:

fuzzy-denite\scratch>go run gopickle.go
INFO[0000] starting
ERRO[0013] /search: Context 12345 does not exist and no data passed
INFO[0013] Creating closestmatch context 12345
INFO[0014] Created new context 12345 of size 1000
INFO[0014] Searching for com in context 12345
INFO[0014] 10 matches for com. Will return max: 10 results
INFO[0018] Searching for com in context 12345
INFO[0018] 10 matches for com. Will return max: 10 results

Here are the client logs:

fuzzy-denite\scratch>python sender.py send closestmatch p1000.dat
['closestmatch', 'p1000.dat']
com
resending with data
200 OK
10
d:\code\go\src\github.com\josharian\impl\LICENSE.txt
d:\code\go\src\github.com\fatih\motion\main.go
d:\code\go\src\golang.org\x\text\LICENSE
d:\code\go\src\github.com\golang\dep\analyzer.go
d:\code\go\src\github.com\kisielk\gotool\go13.go
d:\code\go\src\google.golang.org\api\google-api-go-generator\clients_test.go
d:\code\go\src\github.com\tpng\gopkgs\LICENSE.txt
d:\code\go\src\gopkg.in\urfave\cli.v1\appveyor.yml
d:\code\go\src\github.com\nsf\gocode\scope.go
d:\code\go\pkg\dep\sources\https---github.com-sirupsen-logrus.git\formatter.go
com
200 OK
10
d:\code\go\src\github.com\BurntSushi\toml\COMPATIBLE
d:\code\go\pkg\dep\sources\https---github.com-onsi-gomega\matchers\be_closed_matcher_test.go
d:\code\go\pkg\dep\sources\https---github.com-spf13-cobra\command_notwin.go
d:\code\go\src\google.golang.org\api\examples\gopher.png
d:\code\go\src\github.com\BurntSushi\toml\doc.go
d:\code\go\pkg\dep\sources\https---github.com-sergi-go--diff\APACHE-LICENSE-2.0
d:\code\go\pkg\dep\sources\https---gopkg.in-yaml.v2\yamlh.go
d:\code\go\src\github.com\peterh\liner\output.go
d:\code\go\src\github.com\nsf\gocode\config.go
d:\code\go\src\github.com\sirupsen\logrus\doc.go

Sources and data are here: https://github.com/raghur/fuzzy-denite/tree/closestmatch/scratch

raghur, May 08 '18 10:05

OK, I went through the code and it uses goroutines to match. However, this seems faulty: shouldn't the ranking be stable? Also, after a few more iterations, I land on the following:

com
200 OK
10
d:\code\go\src\github.com\bytesparadise\libasciidoc\codecov.yml
d:\code\go\bin\megacheck.exe
d:\code\go\tags
...

Those aren't the 'best' ranked matches; look at the second and third results.

raghur, May 08 '18 11:05

@raghur Can you explain what your file logs are for? What is the search list and what is your query?

schollz, May 08 '18 20:05

I'm trying to build an interactive fuzzy file filtering routine to filter file names under a directory as the user types.

In the client logs above, 'com' is the query. The search list is a fixed list of 1000 files that I'm using as a test corpus.

raghur, May 09 '18 00:05

I'm getting the same problem in Closest, where a fixed list and a fixed search string return different results between invocations.

Evidlo, May 18 '18 09:05