Add better sorting of completions
Description
Sort completions by match point, then match length, then item text.
Note that match length is equal to the length of the search text for non-fuzzy matches, and equal to the length of the match group for fuzzy matches.
This means that matches that start with the search text will be displayed first, followed by matches that contain the search text, and then fuzzy matches.
Sorting by item text when both match point and match length are the same for matched completions ensures that completions that start with the search text will be in alphabetical order. For example, searching for `fr` will display `FROM` before `FROM_UNIXTIME`.
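A minimal sketch of the proposed ordering; the `(match_length, match_point, text)` tuple layout here is illustrative, not necessarily mycli's exact internals:

```python
# Sketch only: assumed (match_length, match_point, text) tuples for the
# search text "fr"; mycli's actual internals may differ.
matches = [
    (2, 2, "REFRESH"),        # contains "fr" at index 2
    (2, 0, "FROM_UNIXTIME"),  # starts with "fr"
    (2, 0, "FROM"),           # starts with "fr" too
]
# Sort by match point, then match length, then item text.
matches.sort(key=lambda m: (m[1], m[0], m[2].lower()))
print([m[2] for m in matches])  # ['FROM', 'FROM_UNIXTIME', 'REFRESH']
```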
Checklist
- [ ] I've added this contribution to the `changelog.md`.
- [x] I've added my name to the `AUTHORS` file (or it's already there).
Wow, you weren't joking about the tests.
From the English description, the algorithm sounds great.
I may have time to experiment with this on the weekend.
Yeah, that's why I wanted to make sure the algorithm would be accepted before working on the tests. From my experiments with this new sorting, I've found that it has always put what I consider the most relevant results at the top.
The funny thing was that the `find_matches(...)` function was already storing all three of the parameters I used for sorting, and then ignoring everything but the item text at the end when creating the completions generator. That's what gave me the idea for the sorting algorithm.
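Roughly what I mean, as a paraphrase for illustration rather than the actual mycli source:

```python
from prompt_toolkit.completion import Completion

# Paraphrased existing behavior (not mycli's actual code): the tuples
# already carry match length and match point, but only the text survived
# into the Completion generator.
matches = [(2, 0, "FROM"), (2, 0, "FROM_UNIXTIME")]
completions = (Completion(text=m[2].strip('`')) for m in matches)
```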
Let's let the PR hang around for some days and see if it attracts any objections.
Ok. Maybe you'll have some time to experiment with it as well. I'm hoping it will attract some "likes"...
Should the proposed change only apply under `if not smart_completion:`? It would seem that the point of smart_completion is to give categories in category order.
I don't think so; I personally would rather have a completion at the top that best matches my search text regardless of the category. Have you tried it out yet? Maybe you will agree after some experimenting.
I actually first tried doing the same sort in the `find_matches(...)` function, so per category, and the results were not ideal.
I'm trying it now. It would seem that the config file lies when it claims

```ini
# Enables context sensitive auto-completion. If this is disabled then all
# possible completions will be listed.
smart_completion = True
```

because when set to `False`, table names and such are simply not returned.
It is rather helpful to see the completions in the "smart" ordering when nothing has yet been typed, like maybe

```python
if len(word_before_cursor) != 0 and word_before_cursor[-1] != '(':
    matches = sorted(matches, key=lambda m: (m[1], m[0], m[2].lower().strip('`')))
```
Regarding the `smart_completion = False` setting, I think what you're seeing is a bug. You need to first do

```sql
use <db name>
```

and then you should see table and column names. It doesn't work if you use a DSN to select the db.
By the way, I think it would be really useful to do fuzzy matching when not doing smart completion. Maybe this can be implemented as a setting?
Why closed?
Sorry, didn't realize I did that. I deleted a comment and maybe that closed it?
How can I reopen?
Accidentally closed.
> Regarding the `smart_completion = False` setting, I think what you're seeing is a bug. You need to first do `use <db name>` and then you should see table and column names. It doesn't work if you use a DSN to select the db.
Your suggestion works! But we really need to fix that bug.
> By the way, I think it would be really useful to do fuzzy matching when not doing smart completion. Maybe this can be implemented as a setting?
Yes, though one issue is that I adore settings, and @amjith really does not. For example, for the entire functionality in this PR, I think it would be fantastic if `~/.myclirc` added

```ini
completion_aggressive_sort = True
```

or any other property name that makes sense.
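Sketched out, such a flag might gate the new ordering something like this (neither the option nor this code exists in mycli today; the config lookup is a stand-in):

```python
# Sketch only: gate the PR's sort behind a hypothetical config flag.
config = {"completion_aggressive_sort": True}  # stand-in for ~/.myclirc parsing

matches = [(2, 2, "REFRESH"), (2, 0, "FROM_UNIXTIME"), (2, 0, "FROM")]
if config.get("completion_aggressive_sort", False):
    # The sort from this PR, applied only when the flag is on.
    matches = sorted(matches, key=lambda m: (m[1], m[0], m[2].lower().strip('`')))
```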
> I personally would rather have a completion at the top that best matches my search text regardless of the category.
Right. I personally would also prefer to use your algorithm. But we have to think of the most general use case and the default expectation. That is why I proposed enabling it only outside of the `smart_completion` code path.
Let me catch up on the PR.
I'm not opposed to adding a config option, but if the new completion behavior feels logical we can switch to it by default. Let me read through the proposal and try it out locally.
I think this new sort is much better than the current default, so it seems like most people would prefer it. By the way, I just upgraded to the latest pip version, 1.27.2, and the sorting is now much worse than before. Now it appears to be returning results in alphabetical order, whereas before the order was more by relevance (the previous version I used was 1.26.1).
So now in 1.27.2, when I do this:

```sql
SELECT * FROM e
```

I see a bunch of tables starting with `a` at the top of the results. Previously, the top results were tables starting with `e`.
Also, there are other open issues I've seen where people complain about the fuzzy matching and want an option to turn it off because the fuzzy matches are often at the top of the results. My algorithm will also fix this since the fuzzy matches will always be below exact matches.
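To make that concrete (illustrative numbers under the same assumed tuple layout): searching for `fr`, an exact match records the search-text length while a fuzzy match records the longer span it covers, so at the same match point the exact match sorts first.

```python
# Illustrative only: searching for "fr" with assumed
# (match_length, match_point, text) tuples.
exact = (2, 0, "FROM")   # literal "fr": match length is len("fr") == 2
fuzzy = (5, 0, "FLOOR")  # f...r spread across "floor": match group length 5
ordered = sorted([exact, fuzzy], key=lambda m: (m[1], m[0], m[2].lower()))
print(ordered[0][2])  # FROM, since the exact match outranks the fuzzy one
```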
I'm still testing things out. Notes so far: I like that `FROM` is listed above `FROM_UNIXTIME`. There is a crash that happens when I type `SOURCE `. Haven't dug into it yet; if you have the time, please take a look.
> I'm still testing things out. Notes so far: I like that `FROM` is listed above `FROM_UNIXTIME`. There is a crash that happens when I type `SOURCE `. Haven't dug into it yet; if you have the time, please take a look.
Ok, I see the issue. The `find_files()` method is returning a `Completion` generator and the code is expecting an array of match tuples.
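One possible shape of a fix; the `find_files()` below is a stand-in I wrote for illustration, not mycli's actual implementation:

```python
from prompt_toolkit.completion import Completion

# Stand-in for find_files(), written for illustration; not mycli's code.
def find_files(prefix):
    for name in ("dump.sql", "schema.sql"):
        if name.startswith(prefix):
            yield Completion(text=name)

# Materialize the Completion generator into the assumed
# (match_length, match_point, text) tuples the sorting code expects.
matches = [(len(c.text), 0, c.text) for c in find_files("d")]
```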
Any idea what happened to the sorting in the latest version (1.27.2)? Why is it now alphabetical?
Still waiting on feedback for this one...
I've tested it locally and I'm fine with the ordering. Happy to merge if you can fix the tests.
Cool, I'll work on fixing the tests.
Is there a way to run tests individually? I'm having a bit of trouble understanding the test setup.
We just use `pytest`. You can run `pytest -k <name of the test>` to run just a single test. For example:

```
pytest -k test_suggested_multiple_column_names_with_dot
```

will run only that one single test. You can also specify partial names to select multiple tests:

```
pytest -k test_suggested
```

will select all the tests that start with that name.
Cool, thanks! I don't use Python that much, so this is all pretty new to me. First test fixed...
I've fixed all of the unit tests, but when I pushed my changes, I got an email saying "PR run failed ...".
How do I fix this?
Looks like the error is related to the `test` command, which is not directly related to this pull request:

> The test command is disabled and references to it are deprecated.
What's the status of this pull request?
What happens if you change the line `setuptools` in `requirements-dev.txt` to `setuptools<=71.1.0`?
Better, but why are most of these checks getting cancelled? Also, the one that failed had no errors:
```
8 features passed, 0 failed, 0 skipped
32 scenarios passed, 0 failed, 0 skipped
139 steps passed, 0 failed, 0 skipped, 0 undefined
Took 0m16.305s
Error: Process completed with exit code 1.
```
First `pytest` is run, then `behave` is run. Then the CI run exits with a failure if either one failed. In this case, `pytest` is showing failures:
```
test/test_clistyle.py ss [ 0%]
test/test_completion_engine.py ......................................... [ 15%]
...................x................................................... [ 41%]
test/test_completion_refresher.py .... [ 42%]
test/test_config.py .............. [ 47%]
test/test_dbspecial.py .... [ 49%]
test/test_main.py ................... [ 56%]
test/test_naive_completion.py ..... [ 57%]
test/test_parseutils.py .............X........................ [ 71%]
test/test_prompt_utils.py . [ 72%]
test/test_smart_completion_public_schema_only.py .................F.... [ 80%]
test/test_special_iocommands.py ................... [ 86%]
test/test_sqlexecute.py ................................... [ 99%]
test/test_tabular_output.py F [100%]
```
I believe the other CI runs in the matrix were canceled as soon as one failed.
I realized that the reason one of the tests is failing is that my sort algorithm compares lowercased matches, which doesn't work for the special commands starting with `\`, since these are case-sensitive. How should I handle this?
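One option I can think of, sketched under the same assumed tuple layout (`text_key` is a name I made up, not committed code):

```python
# Hypothetical workaround: keep backslash special commands case-sensitive,
# lowercase everything else for the alphabetical tie-break.
def text_key(text):
    cleaned = text.strip('`')
    return cleaned if cleaned.startswith('\\') else cleaned.lower()

matches = [(2, 0, '\\dt'), (2, 0, 'FROM')]
matches.sort(key=lambda m: (m[1], m[0], text_key(m[2])))
```

Not sure it's worth the extra complexity, though.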
I feel like there shouldn't even be empty-string completion, especially since it isn't really an empty string but rather a blank string, and it requires entering something like a space to trigger it.
It's also strange that when I run `./setup.py test` I get an error, but when I run the following command, it passes:

```
pytest -vv -k test_empty_string_completion test/test_smart_completion_public_schema_only.py
```
Still waiting to hear back on how to handle this one failing unit test. It doesn't seem worth complicating the sort method just to support these special commands and fix a single unit test.
Pretty disappointed not to get any feedback on this. I spent a lot of time trying to fix all these unit tests, and I think it's really close.
I wanted to contribute to this project, but I guess I can just use my own branch.