evil
evil copied to clipboard
Handle composed characters in evil-range
This is a low-level fix for the composed characters problem (see #1656). The idea is to prevent evil-range
from returning a range that lands inside a sequence of composed characters. I think this makes sense, because evil-range
already takes care to return a valid range through calling evil-normalize-position
, and having the range (beg end)
start or end inside a sequence of hidden characters, seems like it should be considered invalid.
If this seems too aggressive, I could see hiding this feature behind a flag.
Also added some tests.
Looks fine to me, although I don't know much about composition. The tests look good, but maybe something using evil-test-buffer
which fails without this and succeeds with it would illustrate why ending up inside a sequence of hidden characters is bad. My assumption is that selecting and e.g. deleting halfway through a composition would leave one of the components behind? And although I don't much about composition, I can see that wouldn't be desired behaviour.
Sure, I can add more tests.
FWIW, here's the basic idea. If you have a buffer with text like
\alpha
you can use prettify-symbols-mode
which uses composition behind the scenes to visually make the string \alpha
appear as
α
anywhere in the buffer. However, the change is only cosmetic and prior to this evil only "saw" and acted on the string \alpha
(up to internal adjustments made by emacs). If you, for example, put the cursor in the first position and hit x
in normal mode, evil would delete just the \
character, leaving
alpha
because after hitting x
evil calculates a range of characters to delete and since it does not know about composition it just expands the range one character to the right and deletes the single \
.
After this PR, evil-range
will not stop inside the \alpha
string because it is composed, and it will expand the range all the way to the end of \alpha
deleting the entire composed character.
By the way, it is a little presumptuous to assume that the user wanted x
to delete all of \alpha
here, but I think it makes the most sense.
Thanks for that great explanation. When I'm in emacs-state (or for that matter, insert-state, which doesn't map delete differently) and press delete before \alpha
I get alpha
so presumably emacs devs have decided the current behaviour is desirable? What I'm getting at here is: is this an evil feature, or is it a package which allows easier manipulation of composed text? Does vim have an equivalent of composition and handle it more like this PR? BTW I could still imagine this being behind a feature flag if the answer to those questions is no. I can see that it could be helpful :)
Well... it's complicated. I brought up x
because I think it's the harder one to swallow. If you try to use a region to delete \
in emacs state, it won't let you select just the \
character because emacs always moves the point outside of these composed sequences in the command loop. In other words, emacs never lets you move point inside one of these sequences. So in that sense, emacs treats them as single units. However, delete-char
does not treat the whole character sequence as a single unit, like you saw. On the evil side, everything is kind of a region behind the scenes. Even when you ask to delete a character with x
it creates something analogous to a region behind the scenes, causing this change to evil-range
to affect even x
.
We could try to stay more faithful to the emacs behavior and only make the required changes for regions, but I'm not sure if that is the right answer here. I found this plugin for vim, but it opens up all of the composed characters on the line, so I can't check the behavior there.
I personally like the behavior in this PR, because it lets me pretend that all of these latex macros are just single unicode characters, but hiding the behavior behind a flag would be fine with me. One issue with the flag is that it's going to be hard to make it obvious what it means to an average user.
Cool - all understood. I think flags which are hard to understand to the average user are fine as long as they are off by default and have a good doc-string.
I just pulled your branch and turned on prettify-symbols-mode and noticed that when I press backspace in insert mode after a λ
(which is pretified lambda
in elisp files) it sometimes ends up looking like lamb|a
(where |
is the cursor). To repro: If I do vy
on a lambda, and paste elsewhere, the cursor ends up after it (whereas usually the cursor would end up on it). If I press i
to enter insert mode and then backspace, I end up in that situation. If I press h
instead (in normal mode) the cursor doesn't appear to move until I've pressed it the number of letters in the actual word. None of these things happen in master currently. I hate to be a downer, but I think these things need fixing before we merge (and probably a fair bit more manual playing around, plus tests). lmk what you think.
Thanks for checking. I'm happy to tweak or improve as necessary. I also think hiding it behind a flag makes sense for now. I did try to reproduce what you were talking about, but had some difficulty.
To repro: If I do vy on a lambda, and paste elsewhere, the cursor ends up after it (whereas usually the cursor would end up on it). If I press i to enter insert mode and then backspace, I end up in that situation.
What seems to be happening is that after the paste on a new line the cursor is in the last position of the line. evil then adjusts the cursor back one position, which is inside the composed lambda, and emacs never readjusts it outside of the lambda.
I hate to be a downer, but I think these things need fixing before we merge (and probably a fair bit more manual playing around, plus tests). lmk what you think.
No problem. I'm also going to put it behind a flag that makes it and experimental feature.