Sorting speed for large collections
Sorting is nice but can be slow for large collections. Maybe a threshold could be added to decide when to stop sorting. Another, more radical idea I had was to just show the history candidates on the first invocation (with the default and the input element moved to the front) and then start sorting once input arrives and the collection has already been refined to a smaller size. Having the minibuffer pop up immediately would give this approach a snappy feeling, I guess.
Yeah, a threshold is the most obvious idea but unfortunately I think that's a non-starter by itself, because it's critically important to show the recently selected candidates at the top even before anything has been typed.
Special-casing the first invocation to show the history candidates is I think the most promising approach, but it needs to be done with care so that it's robust and not a gigantic hack. In particular, you can't just show the top of the history, because history elements may no longer actually be candidates. (This will happen all the time in filename completion under Selectrum, for example.) And then you're going to have to actually do the sorting properly at some point. How do you decide when?
I would be very happy to see an implementation proposal that addresses these concerns, as I have been tossing them around for some time without being able to come up with anything clean.
Oh yes, another consideration: you might have to sort the candidates initially anyway, if there aren't enough history elements to fill the minibuffer. Another special case to deal with.
Yeah, file completion has to be special-cased; for most other commands it should hopefully work (at least when they define a history variable; describe-function and describe-variable don't, which seems like a BUG). We could also just define a history variable on the fly instead of falling back to minibuffer-history. I already do that in my config.
And then you're going to have to actually do the sorting properly at some point. How do you decide when?
I don't know; maybe as soon as not enough history element matches are left for display. This would also cover the case where there aren't enough history elements to begin with: then sorting would have to be done right away. When the filtering is fast and we do the sorting afterwards, the reduced size might improve the sorting speed significantly. But I'm just guessing, no idea if it will actually work out.
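To make the heuristic concrete, here is a rough sketch (all names are made up; none of this is Selectrum's actual API): keep showing the history matches until there are too few of them to fill the display, then sort the already-filtered, and hence smaller, candidate set once.

```elisp
;; Sketch only: `my-refine' and `my--sorted-yet' are hypothetical names.
(require 'seq)

(defvar my--sorted-yet nil
  "Non-nil once we have fallen back to full sorting in this session.")

(defun my-refine (input candidates history height)
  "Return candidates matching INPUT, sorting only when history runs dry.
HEIGHT is the number of display lines that need to be filled."
  (let* ((matches (seq-filter
                   (lambda (c) (string-match-p (regexp-quote input) c))
                   candidates))
         ;; History elements that are still actual candidates, in
         ;; recency order (`seq-intersection' keeps the order of its
         ;; first argument).
         (hist-matches (seq-intersection history matches)))
    (if (and (not my--sorted-yet)
             (>= (length hist-matches) height))
        ;; Enough history matches to fill the display: show them first
        ;; and skip sorting entirely.
        (append hist-matches (seq-difference matches history))
      ;; Too few history matches (or none to begin with): sort now,
      ;; on the reduced set.
      (setq my--sorted-yet t)
      (sort matches #'string<))))
```

Note that intersecting with the live candidate set also covers the concern above that history elements may no longer be valid candidates.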
file completion has to be special cased
I would strongly prefer not to do that. Special-casing individual commands usually does not end well in the long run. In this case, it will introduce a number of subtle bugs related to things like commands being presented in M-x that aren't actually runnable because the libraries that define them haven't yet been loaded in the current Emacs session. We should solve the problem in general, so that the behavior of Selectrum remains exactly the same to the end user, just hopefully with better performance.
Special-casing individual commands usually does not end well in the long run. In this case, it will introduce a number of subtle bugs related to things like commands being presented in M-x that aren't actually runnable because the libraries that define them haven't yet been loaded in the current Emacs session. We should solve the problem in general, so that the behavior of Selectrum remains exactly the same to the end user, just hopefully with better performance.
Good point, I don't have a good answer to this.
I also heard that it is not an issue for some people; maybe I need to get a faster machine :running_man:
Caching would work for static collections. We could then do insertion sort on them, which should be faster for almost-sorted collections, but that would have to be done in Elisp, which slows it down again.
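For illustration, a plain Elisp insertion sort (nothing Selectrum-specific, just a sketch) would look like this; it runs in near-linear time when the cached collection is already almost sorted, but every comparison and element move happens in interpreted Elisp rather than in C:

```elisp
;; Sketch: `my-insertion-sort' is a made-up helper for discussion.
(defun my-insertion-sort (vec pred)
  "Sort vector VEC in place with insertion sort, comparing via PRED.
Cheap when VEC is already nearly sorted: each element moves only a
short distance."
  (let ((n (length vec)))
    (dotimes (i n)
      (let ((key (aref vec i))
            (j (1- i)))
        ;; Shift larger elements right until KEY's slot is found.
        (while (and (>= j 0) (funcall pred key (aref vec j)))
          (aset vec (1+ j) (aref vec j))
          (setq j (1- j)))
        (aset vec (1+ j) key))))
  vec)
```

Whether this beats the built-in C `sort` on nearly-sorted input would need benchmarking; the constant factor of interpreted Elisp is large.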
maybe I need to get a faster machine
This certainly would solve the problem, but I don't think there's a good reason why selecting an item from a list should require a high-performance machine. We should improve the experience for everyone, not just those who can afford high-end hardware.
but that would have to be done in Elisp which slows it down again
Yeah. Turns out that using C functions usually outweighs considerations of algorithmic complexity. Can't wait for gccemacs.
I found that describe-variable is 5x slower than ivy even when selectrum-should-sort-p is nil
;; The default selectrum
(benchmark 1 '(call-interactively #'describe-variable))
;;=> "Elapsed time: 0.964549s"
;; selectrum with nil `selectrum-should-sort-p'
(benchmark 1 '(let ((selectrum-should-sort-p nil))
                (call-interactively #'describe-variable)))
;;=> "Elapsed time: 0.804663s"
;; ivy-mode
(benchmark 1 '(call-interactively #'describe-variable))
;;=> "Elapsed time: 0.169659s"
@condy0919 Thank you for the numbers!
See also #312. @oantolin, I think this is the slowdown you observed.
:+1: The ultimate response time
@condy0919 Well, I am just an interested contributor here, not one of the maintainers ;)
describe-variable is slower with Selectrum for some reason. describe-function is about equally fast in ivy and selectrum.
Thanks for the benchmark @condy0919! I care about the case where the input is fed programmatically, because that's what Embark does. Selectrum is clearly slower than default completion or icomplete (I ran 10 iterations because there is a lot of variation):
(benchmark 10 '(let ((unread-command-events '(?n ?i ?l 13)))
                 (call-interactively #'describe-variable)))
"Elapsed time: 17.478012s (1.220557s in 6 GCs)" ; embark-live-occur-after-input
"Elapsed time: 17.716527s (1.272396s in 7 GCs)" ; embark-live-occur-after-delay
"Elapsed time: 17.901300s (1.140958s in 6 GCs)" ; default tab completion
"Elapsed time: 19.553066s (1.297312s in 7 GCs)" ; icomplete-vertical
"Elapsed time: 19.683581s (1.308557s in 7 GCs)" ; icomplete
"Elapsed time: 21.154784s (1.519785s in 8 GCs)" ; ivy
"Elapsed time: 27.062260s (1.590118s in 9 GCs)" ; selectrum
"Elapsed time: 27.597704s (1.710274s in 9 GCs)" ; selectrum-should-sort-p = nil
Note that Embark's completion UIs are essentially just default tab completion plus popping up a buffer after you type a bit or after a delay, so for this benchmark they should behave exactly like default tab completion, and they do. Also note that Selectrum is slow whether I ask it to sort or not.
I think it may not be sorting that's slowing down C-h v in Selectrum, but rather the GC.
EDIT: @minad's right, I don't know why I said this. I think it was because I saw it trigger GC more times.
@oantolin But there is not significantly more time spent in GC, is there? Could you also benchmark allocated memory?
Oh, you are right @minad! I think I only said the GC thing because I saw more GCs, but the total time is not much larger. I added ivy to the results I quoted: it's a little slower than icomplete but faster than selectrum, which matches my expectations, I guess.
How do I benchmark allocated memory?
How do I benchmark allocated memory?
Sorry, I don't know. Probably requires running the profiler. Alternatively one can look at the gc statistics output.
But since the GC times are small I don't expect you would find anything. If anything, Selectrum's GC is performing fine: it has only a few more GCs but a significantly longer runtime.
The problem is neither sorting nor GC/allocations.
The problem is neither sorting nor GC/allocations.
Right, and it seems to be specific to describe-variable. I tried a similar thing with execute-extended-command, and all the completion UIs I tried above (now including ivy) take basically the same amount of time!
(benchmark 10 '(let ((unread-command-events '(?p ?w ?d 13)))
                 (call-interactively #'execute-extended-command)))
"Elapsed time: 22.527526s (0.607186s in 3 GCs)" ; default tab completion
"Elapsed time: 23.188627s (0.873425s in 4 GCs)" ; ivy
"Elapsed time: 23.383487s (0.667524s in 3 GCs)" ; icomplete
"Elapsed time: 23.775664s (1.031111s in 5 GCs)" ; selectrum
~From the GCs it looks as if selectrum is producing more garbage, which is cheap, so the runtime doesn't go up significantly, but more GCs are needed.~ (this is probably not correct for the Emacs mark-and-sweep GC)
Did you also look at profiles? Maybe that would help? But I guess @clemera or @raxod502 also profiled selectrum many times...
I did not profile Selectrum until now; I was mostly busy working on the general UI and such. I don't know where the slowdown for describe-variable is coming from, so this might be the time to finally do some profiling :)
@clemera I did some profiling when I wrote consult-line. selectrum did well there. Most of the time is spent in consult--line-candidates, but that was okay after I did some basic optimizations.
But would be interesting to profile describe-variable, since this seems to hit some edge case.
Yes, after some digging I found the reason: the predicate passed by describe-variable switches the current buffer for each candidate. In Selectrum we compute the candidates in the post-command hook, when the minibuffer is already the current buffer, so the buffer needs to be adjusted for each candidate; Ivy does this before the minibuffer is entered, so the buffer can stay the same.
The reason we compute them late is to behave more like default minibuffer completion; for example, with embark you want the candidates to be computed only after the input has been injected.
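One way to sketch the fix being discussed (a hypothetical wrapper, not Selectrum's actual code): keep the late computation, but make the originally selected window's buffer current while the completion table and predicate run, so a predicate like the one describe-variable passes does not trigger a buffer switch per candidate.

```elisp
;; Sketch: `my-all-completions' is a made-up helper for discussion.
;; `minibuffer-selected-window' returns the window that was selected
;; when the minibuffer was entered (nil outside the minibuffer).
(defun my-all-completions (input table pred)
  "Run completion TABLE and PRED with the original buffer current.
Called from the minibuffer, this restores the context the user was
in before the minibuffer was entered, as default completion does."
  (with-current-buffer
      (window-buffer (or (minibuffer-selected-window) (selected-window)))
    (all-completions input table pred)))
```

The buffer switch then happens once per refresh instead of once per candidate, which is where the describe-variable slowdown seems to come from.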
Is there a chance to get this fixed? You are running everything in the temporary buffer context afaik. We talked about this before, when I wondered about the context of the annotation functions, I believe. It should be such that Selectrum can efficiently add to the temporary buffer, but the functions passed to completing-read should be executed in the proper context.
You can still do the late computation but just put the loop in the right context if that's possible?
You are running everything in the temporary buffer context afaik.
Only when computing the display of candidates; the computation of the candidates themselves happens with the minibuffer current, like with default completion.
You can still do the late computation but just put the loop in the right context if that's possible?
I think we would need to special case help--symbol-completion-table because usually the right context should be the minibuffer.
I think we would need to special case help--symbol-completion-table because usually the right context should be the minibuffer.
Are you sure? I think you are currently executing in the temporary buffer? So I would change it to the original buffer, if that is what default completion does.