mycli icon indicating copy to clipboard operation
mycli copied to clipboard

Add better sorting of completions

Open schmeic opened this issue 1 year ago • 30 comments

Description

Sort completions by match point, then match length, then item text.

Note that match length is equal to the length of the search text for non fuzzy matches, and equal to the length of the match group for fuzzy matches.

This means that matches that start with the search text will be displayed first, followed by matches that contain the search text, and then fuzzy matches.

Sorting by item text when both match point and match length are the same for matched completions ensures that completions that start with the search text will be in alphabetical order. For example, searching for fr will display FROM before FROM_UNIXTIME.

Checklist

  • [ ] I've added this contribution to the changelog.md.
  • [x] I've added my name to the AUTHORS file (or it's already there).

schmeic avatar Jun 13 '24 01:06 schmeic

Wow, you weren't joking about the tests.

From the English description, the algorithm sounds great.

I may have time to experiment with this on the weekend.

rolandwalker avatar Jun 13 '24 12:06 rolandwalker

Yeah, that's why I wanted to make sure the algorithm would be accepted before working on the tests. From my experiments with this new sorting, I've found that it has always put what I consider the most relevant results at the top.

The funny thing was that the find_matches(...) function was already storing all 3 of the parameters I used for sorting, and then ignoring everything but the item text at the end when creating the completions generator. That's what gave me the idea for the sorting algorithm.

schmeic avatar Jun 13 '24 18:06 schmeic

Let's let the PR hang around for some days and see if it attracts any objections.

rolandwalker avatar Jun 13 '24 19:06 rolandwalker

Ok. Maybe you'll have some time to experiment with it as well. I'm hoping it will attract some "likes"...

schmeic avatar Jun 13 '24 19:06 schmeic

Should the proposed change only apply when if not smart_completion: ? It would see that the point of smart_completion is to give categories in category order.

rolandwalker avatar Jun 13 '24 21:06 rolandwalker

I don't think so, I personally would rather have a completion at the top that best matches my search text regardless of the category. Have to tried it out yet? Maybe you will agree after some experimenting.

I actually first tried doing the same sort in the find_matches(...) function, so per category, and the results were not ideal.

schmeic avatar Jun 13 '24 21:06 schmeic

I'm trying it now. It would seem that the config file lies when it claims

# Enables context sensitive auto-completion. If this is disabled then all
# possible completions will be listed.
smart_completion = True

because when set to False, table names and such are simply not returned.

It is rather helpful to see the completions in the "smart" ordering when nothing has yet been typed, like maybe

            if len(word_before_cursor) != 0 and word_before_cursor[-1] != '(':
                matches = sorted(matches, key=lambda m: (m[1], m[0], m[2].lower().strip('`')))

rolandwalker avatar Jun 13 '24 22:06 rolandwalker

Regarding the smart_completion = False setting, I think what you're seeing is a bug. You need to first do:

use <db name>

and then you should see table and column names. It doesn't work if you use a DSN to select the db.

By the way, I think it would be really useful to do fuzzy matching when not doing smart completion. Maybe this can be implemented as a setting?

schmeic avatar Jun 13 '24 22:06 schmeic

Why closed?

rolandwalker avatar Jun 13 '24 23:06 rolandwalker

Sorry, didn't realize I did that. I deleted a comment and maybe that closed it?

How can I reopen?

schmeic avatar Jun 13 '24 23:06 schmeic

Accidentally closed.

schmeic avatar Jun 13 '24 23:06 schmeic

Regarding the smart_completion = False setting, I think what you're seeing is a bug. You need to first do: use and then you should see table and column names. It doesn't work if you use a DSN to select the db.

Your suggestion works! But we really need to fix that bug.

By the way, I think it would be really useful to do fuzzy matching when not doing smart completion. Maybe this can be implemented as a setting?

Yes, though one issue is that I adore settings, and @amjith really does not. For example for the entire functionality in this PR, I think it would be fantastic if ~/.myclirc added

completion_aggresive_sort = True

or any other property name that makes sense.

I personally would rather have a completion at the top that best matches my search text regardless of the category.

Right. I personally would also prefer to use your algorithm. But we have to think of the most general usecase and the default expectation. That is why I proposed to enable it outside of the smart_completion codepath.

rolandwalker avatar Jun 14 '24 11:06 rolandwalker

Let me catch up on the PR.

I'm not opposed to adding a config option but if the new completion behavior feels logical we can switch to it by default. Let me read through the proposal and try it out locally.

amjith avatar Jun 14 '24 14:06 amjith

I think this new sort is much better than the current default, so it seems like most people would prefer it. By the way, I just upgraded to the latest pip version of 1.27.2, and the sorting is now much worse than before. Now it appears to be returning results in alphabetical order, where as before the order was more by relevance (previous version I used was 1.26.1).

So now in 1.27.2, when I do this: SELECT * FROM e I see a bunch of tables starting with a at the top of the results. Previously, the top results were tables starting with e.

Also, there are other open issues I've seen where people complain about the fuzzy matching and want an option to turn it off because the fuzzy matches are often at the top of the results. My algorithm will also fix this since the fuzzy matches will always be below exact matches.

schmeic avatar Jun 14 '24 17:06 schmeic

I'm still testing things out. Notes so far, I like that FROM is listed about FROM_UNIXTIME. There is a crash that happens when I type SOURCE . Haven't dug into it yet, if you have the time please take a look.

amjith avatar Jun 15 '24 21:06 amjith

I'm still testing things out. Notes so far, I like that FROM is listed about FROM_UNIXTIME. There is a crash that happens when I type SOURCE . Haven't dug into it yet, if you have the time please take a look.

Ok, I see the issue. The find_files() method is returning a Completion generator and the code is expecting an array of match tuples.

Any idea what happened to the sorting in the latest version (1.27.1) ? Why is it now alphabetical?

schmeic avatar Jun 17 '24 18:06 schmeic

Still waiting on feedback for this one...

schmeic avatar Jun 27 '24 19:06 schmeic

I've tested it locally and I'm fine with the ordering. Happy to merge if you can fix the tests.

amjith avatar Jun 27 '24 19:06 amjith

Cool, I'll work on fixing the tests.

schmeic avatar Jun 27 '24 19:06 schmeic

Is there a way to run tests individually? I'm having a bit of trouble understanding the test setup.

schmeic avatar Jul 09 '24 21:07 schmeic

We just use pytest. You can run pytest -k <name of the test> to run just a single test.

For example:

pytest -k test_suggested_multiple_column_names_with_dot will run only that one single test.

You can also specify partial names to select multiple tests.

pytest -k test_suggested will select all the tests that start with that name.

amjith avatar Jul 09 '24 23:07 amjith

Cool, thanks! I don't use python that much, so this is all pretty new to me. First test fixed...

schmeic avatar Jul 10 '24 00:07 schmeic

I've fixed all of the unit tests, but when I pushed my changes, I get an email saying the "PR run failed ...".

How do I fix this?

schmeic avatar Aug 07 '24 22:08 schmeic

Looks like the error is related to the test command, which is not directly related to this pull request.

"The test command is disabled and references to it are deprecated."

What's the status of this pull request?

schmeic avatar Aug 19 '24 20:08 schmeic

What happens if you change the line setuptools in requirements-dev.txt to setuptools<=71.1.0 ?

rolandwalker avatar Aug 19 '24 21:08 rolandwalker

Better, but why are most of these checks getting cancelled? Also, the one that failed had no errors:

8 features passed, 0 failed, 0 skipped 32 scenarios passed, 0 failed, 0 skipped 139 steps passed, 0 failed, 0 skipped, 0 undefined Took 0m16.305s Error: Process completed with exit code 1.

schmeic avatar Aug 20 '24 17:08 schmeic

First pytest is run, then behave is run. Then the CI run exits with a failure if either one failed. In this case, pytest is showing failures:

test/test_clistyle.py ss                                                 [  0%]
test/test_completion_engine.py ......................................... [ 15%]
...................x...................................................  [ 41%]
test/test_completion_refresher.py ....                                   [ 42%]
test/test_config.py ..............                                       [ 47%]
test/test_dbspecial.py ....                                              [ 49%]
test/test_main.py ...................                                    [ 56%]
test/test_naive_completion.py .....                                      [ 57%]
test/test_parseutils.py .............X........................           [ 71%]
test/test_prompt_utils.py .                                              [ 72%]
test/test_smart_completion_public_schema_only.py .................F....  [ 80%]
test/test_special_iocommands.py ...................                      [ 86%]
test/test_sqlexecute.py ...................................              [ 99%]
test/test_tabular_output.py F                                            [100%]

I believe the other CI runs in the matrix were canceled as soon as one failed.

rolandwalker avatar Aug 20 '24 21:08 rolandwalker

I realized that the reason one of the tests is failing is because my sort algorithm is comparing lower cased matches, which doesn't work for the special commands starting with \, since these are case sensitive. How should I handle this?

I feel like there shouldn't even be empty string completion, especially since it isn't really empty string, but rather blank string and requires entering something like a space to trigger it.

It's also strange that when I run ./setup.py test I get an error, but when I run the following command, it passes: pytest -vv -k test_empty_string_completion test/test_smart_completion_public_schema_only.py

schmeic avatar Aug 21 '24 00:08 schmeic

Still waiting to hear back on how to handle this one failing unit test. Doesn't seem worth complicating the sort method just to support these special commands, and fix a single unit test.

schmeic avatar Sep 12 '24 19:09 schmeic

Pretty disappointed not to get any feedback on this, I spent a lot of time trying to fix all these unit tests and I think it's really close.

I wanted to contribute to this project, but I guess I can just use my own branch.

schmeic avatar Oct 09 '24 23:10 schmeic