evil
evil copied to clipboard
use the `inherit-input-method` argument of read-char
Supercedes #857.
I don't quite understand what this fixes and get no different behavior from testing with the "c" interactive code changed. Can you provide a complete repro of an usecase, show how it goes wrong and how it should be? Also, provide details about your setup including the terminal emulator, just as with the bug report template.
edit: From a cursory search, read-char
is used for reading markers (which should be alphanumerical), registers (a bit more than alphanumerical) and an unused interactive code. Given that Evil tries emulating Vim as faithfully as possible, I'm inclined to close this in favor of restricting what's read here.
Issue type
- Bug report
Environment
Emacs version: 26.0.50 Operating System: Arch Linux Evil version: current master (f371f2c6fdd57a939502f5032a8d93cd71493d0d), but also the latest melpa version, and earlier versions Evil installation type: manual, but also melpa Graphical/Terminal: Terminal emulator in X11 -- reproducible with xterm, konsole, gnome-terminal, etc Terminal multiplexer: It happens in tmux as well as in the terminal emulators mentioned above.
Reproduction steps
- Start Emacs (in terminal mode)
- Set your keyboard to "eu" with
setxkbmap eu
, or use any other keyboard layout with non-ascii unicode characters. - In insert mode type
»
(» is typed on theeu
layout by pressing right-alt+]). Note that the key is now in the buffer (no mojibake). - Go to normal state.
- Type
m»
. - Type
j
ork
to move a line. Note that you move, but\273
is put in the buffer.
To more clearly see what is happening, paste in this code:
(defun read-char-test (char)
(interactive "c")
(if (equal char (char-after))
(message "char given (%c) and char-after (%c) are equal" char (char-after))
(message "char given (%c) and char-after (%c) differ" char (char-after))))
- Eval that defun.
- Move your cursor on to the
»
character. -
M-x read-char-test
, then type»
- You get the message "char given (Â) and char-after (») differ".
- Type
k
to move up a line. - Note that
\273
is put in the buffer and you move up a line.
Expected behavior
read-char
should read the whole »
character and not create mojibake.
Actual behavior
read-char
is only getting part of the character you type, and then dumping the rest into the buffer as mojibake.
Further notes
This is most saliently an issue for evil-surround
, which uses (interactive "c")
in some places. I try to use it with non-ascii delimiters like «»
and it fails. But this is an issue with anything that uses (interactive "c")
and with other things that use (read-char)
like setting and going to marks (it's not a big deal for me there, but if I were to, say, remap evil commands to work on a cyrillic or greek keyboard layout it would be a serious problem).
Hm, I can mostly reproduce this, except mojibake being inserted into the buffer, it's only echoed here. For the record, if you try doing the same thing with Vim, it silently ignores m»
and gives you an error if you attempt jumping to it with '»
.
I'm not terribly surprised that reading in a non-ascii character is tricky to do right, I've implemented reading in UTF-8 myself for fun. What surprised me though is that the flag for doing this hides behind the inherit-input-method
flag. I'm not using any I'm aware of, yet this makes translation of multibyte input into the correct character work. This conflation of multibyte input and input methods implemented in elisp doesn't sit right with me at all.
Yeah, the conflation doesn't make much sense to me either. But regardless whether or not »
and other punctuation should be ignored as a mark name, certainly λ
or и
should work for marks. Any non-latin alphanumeric character will have the same issue. And personally I care a lot more about (interactive "c")
than about marks. read-key
gets the right character, but it seems to cause issues (specifically evil-surround not being able to use .
to repeat). Perhaps read-key
is correct but there is some other bug that interacts with it somehow to cause those issues.
@willghatch will this PR fix the following issue: with non English input method (Bulgarian in my case), f f
searches for the letter f
instead of ф
(for Bulgarian input method)?
Edit: as you've noted, this issue seems related https://github.com/emacs-evil/evil/issues/605
@edkolev I'm afraid not. Doing a little bit of testing, if I'm in input-state
and run read-char
with inherit-input-method
as a true value it does read ф
, but if I'm in normal-state it still reads f
. I don't know much about the guts of evil-mode, so I couldn't say why.
Just as I did in #857, I'd like to warn that this read-key
thing was the main culprit of a very painful regression on evil-surround which I had to revert. It broke the .
operation on evil-surround, which is the main feature of evil-surround.
let us be very careful about merging this in.
I've dug in a bit more (because this issue has been bugging me), and found the root cause behind .
breaking in evil-surround, and it's a bug in evil-repeat.el
. Basically, read-key
and read-key-sequence
both change the result of future calls to this-command-keys
. Evil advises read-key-sequence
, but not read-key
, which causes the use of read-key
to break the key recording for evil-repeat
. Ultimately I think #857 was really correct, and that the problems with it are caused by this evil-repeat
bug. I'm adding a new PR (#865) addressing this.
Why read-key
and read-key-sequence
affect this-command-keys
and not read-char
I'm not sure, but read-key
is semantically more correct for the uses that these patches have tried to correct.