autocomplete-plus
Prefer most used words/suggestions
The completion seems very static. No matter how often I use or complete a function or a variable, the completion suggestions are always the same. The location bonus works sometimes, but it is often overshadowed by function suggestions that I have never used (and never want to, for that matter).
Here is a silly example where I would have liked the order to be different
@tepf Have you tried enabling the Alternate Scoring option in the autocomplete-plus settings?
@50Wliu Yes, I tried, but it didn't change anything. Ironically, GitHub assumed correctly that I wanted to reference you with the '@' and put you at the top.
/cc @jeancroy
Yeah, for both autocomplete-plus and fuzzy-finder there are requests for more learning. I've seen it for the command palette too. I'm not sure how best to do that; I'd need integration from the package to register user selections and possibly manage persistence and context.
As far as static ranking goes, this is the correct choice: one has to prefer shorter entries, because the only way to refine a match is to add characters.
I'll also note that autocomplete-plus has a preference for entries that are nearby. But here handy and handler are only one line apart, so it does not kick in.
@tepf What kind of spec would you imagine for your suggestion? At some point we have to suggest entries that are worse matches but more frequent or recent, and the tradeoffs are not clear to me.
Scoped by project? The way I imagine it, it would be recently typed words, across all files, not necessarily aware of the line before, as in your screenshot.
Well the location bonus is a good start (although it needs a refinement).
But how about counting how many times a completion suggestion has been picked? For example, if I type line and I have completed it to linearEquation multiple times but never to getLine, then I would like it to prefer the one I have completed more often.
I assume reading the file is not an option because of performance, but that would help as well to get an idea which words are more used and are going to be used more in the future.
And ultimately, of course, a kind of semantic completion. For example, don't complete to variables when it would be syntactically wrong; rather prefer other kinds of suggestions that fit. But that is out of our hands, I guess.
But how about counting how many times a completion suggestion has been picked? For example if I type line and I have completed it to linearEquation multiple times but never to getLine then I would like it to prefer the one I have completed more often.
That's a start.
- Now how often is clearly more often ?
- What advantage does this bring vs typing lineq ? You save one keystroke. But what if the history thing cost you one keystroke elsewhere ?
- Is linearEquation something popular in the whole project or just a preference scoped to lin ?
- How often are you switching interest / feature ? How fast should we forget/recycle old popularity ?
- What match quality metric are you willing to give up for popularity ? You suggest overall length. Can we go farther ? Case sensitivity ? Character that are together vs spread ?
- Should it only happens when there's very few character typed ? Or even when there's a lot typed ? (where we can assume you're trying to be specific). What would be a good threshold for a few ?
- I'm thinking the fewer entry that are boosted, the safer it would be to give them strong bonus. Do you have any take on the number of entry we should keep as frequent for the feature to work well?
Now how often is clearly more often ?
I would go about this relatively. For example, line (1 time) and linearEquation (2 times) means linearEquation is 2x as popular. And if a certain threshold is reached, then it will be preferred (over other criteria). Maybe you could run this through some probability function: if both values are low, give them lower importance, but the higher they are, the more important they become. exp(x) could model something like this.
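To make the exp(x) idea concrete, here is a minimal shell sketch. The function name popularity_boost and the RATE constant are hypothetical, chosen only for illustration; nothing here reflects how autocomplete-plus actually scores suggestions. Subtracting 1 so that zero picks give zero boost is also my assumption.

```shell
# Hypothetical helper: turn a pick count into a boost that grows exponentially,
# so low counts barely matter and high counts dominate.
# RATE is an assumed tuning constant, not a value from autocomplete-plus.
RATE=0.5

popularity_boost() {
  # $1 = number of times the suggestion was picked
  awk -v n="$1" -v r="$RATE" 'BEGIN { printf "%.3f\n", exp(r * n) - 1 }'
}

popularity_boost 1   # small boost
popularity_boost 2   # more than twice the boost for 1 pick
```

With a curve like this, a suggestion picked twice gets more than double the boost of one picked once, which matches the intuition that higher counts should weigh disproportionately more.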
But generally you can assume that if a word has been used recently, it is much more likely to be used again. For example, if I just declared myVar, then you can almost certainly say it will be used again within the next 10-15 words I type. If it hasn't appeared by then, you can discard it.
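A rough sketch of that recency window in shell, assuming words are fed in one at a time as they are typed. The names remember_word, is_recent and WINDOW are made up for this example, and the window of 15 is just the upper end of the estimate above.

```shell
WINDOW=15        # assumed window size, taken from the "10-15 words" estimate
recent_words=""  # newline-separated, most recent first

remember_word() {
  # Put $1 at the front, drop any older copy of it, and trim to WINDOW entries.
  recent_words=$(printf '%s\n%s\n' "$1" "$recent_words" |
    awk -v max="$WINDOW" 'NF && !seen[$0]++ && ++kept <= max')
}

is_recent() {
  # Succeeds if $1 is still inside the window (exact, literal match).
  printf '%s\n' "$recent_words" | grep -qxF -- "$1"
}
```

A ranking step could then give a bonus to any candidate for which is_recent succeeds; once a word falls out of the window it simply stops getting the bonus.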
What advantage does this bring vs typing lineq ?
In this case you are right, I could type a short form of it. But assume there is a function getline and your variable is called line. There is no way for me to get around this other than using a different variable name. And right now functions are always preferred over variables for some reason.
Is linearEquation something popular in the whole project or just a preference scoped to lin ?
Not sure what you mean by that. It is, for example, a function I just defined and am going to use to build, let's say, the next 2 functions.
How often are you switching interest / feature ? How fast should we forget/recycle old popularity ?
Ideally when you move out of scope, to another function or file (but I don't know if that is in the realm of possibilities for this project). Or when the word hasn't been used in the last 10-20 completions.
What match quality metric are you willing to give up for popularity ? You suggest overall length. Can we go farther ? Case sensitivity ? Character that are together vs spread ?
Well, I would say case sensitivity usually beats all, because you don't just type something uppercase when all the previous letters have been lowercase. But other than that I can't make any recommendation off the top of my head.
Should it only happens when there's very few character typed ? Or even when there's a lot typed ? (where we can assume you're trying to be specific). What would be a good threshold for a few ?
Yes, the longer the typed word is, the more control you can give back to the user. So you gradually reduce the importance of the popularity bonus. Something like c/x could model this, where "x" is the word length and "c" is some constant.
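As a quick illustration of the c/x falloff, here is a small shell sketch. The function name history_weight and the value C=2 are assumptions for the example only, and capping the result at 1 is my own addition so that very short prefixes don't get an unbounded weight.

```shell
# C is an assumed constant; x is the number of characters typed so far.
C=2

history_weight() {
  # $1 = characters typed; prints C / x, capped at 1.
  awk -v c="$C" -v x="$1" 'BEGIN {
    w = (x > 0) ? c / x : 1
    if (w > 1) w = 1
    printf "%.2f\n", w
  }'
}

history_weight 1   # full weight while almost nothing is typed
history_weight 4   # half weight
history_weight 8   # quarter weight
```

The popularity bonus would be multiplied by this weight, so history matters most for short prefixes and fades as the user types something more specific.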
I'm thinking the fewer entry that are boosted, the safer it would be to give them strong bonus. Do you have any take on the number of entry we should keep as frequent for the feature to work well?
Obviously you need to strike a good balance. But right now it appears that none have gotten that boost.
Finally, you obviously need some experience in statistics to nail this, which I barely have. So all my proposals are merely based on intuition.
I would like to see this implemented as well. To me it would be enough to keep a list of used suggestions and put those at the top of the list. The simplest algorithm could be described with the following bash code:
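last_used() {
  local last="$1"
  # Move $last to the front and drop any earlier occurrence of it
  # (-x: whole line, -F: literal match, so $last is not treated as a regex).
  used_completions="$last"$'\n'"$(printf '%s\n' "$used_completions" | grep -vxF -- "$last")"
}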
The suggestions would then be populated by echo "$used_completions" | grep "^$current_search_pattern"
A slightly better solution would be something like
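# Associative array (bash 4+): completion -> number of times picked.
declare -A used_completions
last_used() {
  local last="$1"
  used_completions[$last]=$(( ${used_completions[$last]:-0} + 1 ))
}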
The suggestions here would be selected by matching keys and then sorted by their values.
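To show what "match the keys, then sort by value" could look like end to end, here is a self-contained sketch. It deliberately swaps the associative array for a plain log with one line per pick, counted with sort and uniq -c, so it also runs in shells without associative arrays. The names record_pick, suggest and picks are hypothetical, and it uses simple prefix matching rather than the fuzzy matching Atom does.

```shell
picks=""   # one line per accepted completion (hypothetical in-memory log)

record_pick() {
  # Append the accepted completion $1 to the log.
  picks=$(printf '%s\n%s' "$picks" "$1")
}

suggest() {
  # Keep logged words that start with prefix $1, count them,
  # and print them most-picked first (ties in alphabetical order).
  printf '%s\n' "$picks" |
    awk -v p="$1" 'NF && substr($0, 1, length(p)) == p' |
    sort | uniq -c | sort -k1,1nr -k2,2 |
    awk '{ print $2 }'
}

record_pick linearEquation
record_pick getLine
record_pick linearEquation
record_pick lineWidth
suggest line   # prints linearEquation, then lineWidth
```

Since linearEquation was picked twice, it is listed ahead of lineWidth for the prefix line, while getLine is left out because it does not start with that prefix.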
The $used_completions data would be saved and updated regularly so that it survives a restart of Atom.
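Persisting the history could be as simple as the sketch below, which treats used_completions as a newline-separated string. The HISTORY_FILE path and the function names save_completions and load_completions are assumptions for illustration; where Atom or the package would really store this is not specified here.

```shell
# Hypothetical location; the real storage path is not specified in this thread.
HISTORY_FILE="${HISTORY_FILE:-$HOME/.completion_history}"

save_completions() {
  # Write to a temp file first, then rename, so an interrupted save
  # cannot leave a half-written history file behind.
  printf '%s\n' "$used_completions" > "$HISTORY_FILE.tmp" &&
    mv "$HISTORY_FILE.tmp" "$HISTORY_FILE"
}

load_completions() {
  # Restore the history if a saved file exists, otherwise start empty.
  if [ -f "$HISTORY_FILE" ]; then
    used_completions=$(cat "$HISTORY_FILE")
  else
    used_completions=""
  fi
}
```

Calling load_completions at startup and save_completions after every few picks would be one way to keep the list across restarts.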