pynvim
replace_termcodes with DecodeHook fails on non-ASCII input
vim.replace_termcodes seems to be broken when called on a vim client with a neovim.DecodeHook installed. It looks like it might be trying to double-decode the input.
An example from maktaba:
>>> vim.replace_termcodes(u":let g:weirdpath = maktaba#path#Join([g:repo, 'weird¬p…l✓u↓g⏎i‽n'])<CR>")
b":let g:weirdpath = maktaba#path#Join([g:repo, 'weird\xc2\xacp\xe2\x80\xfeX\xa6l\xe2\x9c\x93u\xe2\x86\x93g\xe2\x8f\x8ei\xe2\x80\xfeX\xbdn'])\r"
>>> vim.with_hook(neovim.DecodeHook()).replace_termcodes(u":let g:weirdpath = maktaba#path#Join([g:repo, 'weird¬p…l✓u↓g⏎i‽n'])<CR>")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dbarnett/.local/lib/python3.4/site-packages/neovim/api/nvim.py", line 168, in replace_termcodes
    from_part, do_lt, special)
  File "/home/dbarnett/.local/lib/python3.4/site-packages/neovim/api/common.py", line 213, in request
    'out-request')
  File "/home/dbarnett/.local/lib/python3.4/site-packages/neovim/api/common.py", line 240, in walk
    return fn(obj, *args)
  File "/home/dbarnett/.local/lib/python3.4/site-packages/neovim/api/common.py", line 148, in <lambda>
    return lambda o, s, m, k: f1(f2(o, s, m, k), s, m, k)
  File "/home/dbarnett/.local/lib/python3.4/site-packages/neovim/api/common.py", line 170, in _decode_if_bytes
    return obj.decode(self.encoding, errors=self.encoding_errors)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 55-56: invalid continuation byte
I tried some other variations of passing pre-encoded bytes and such, and couldn't find anything that wouldn't blow up.
It would also be good to get unicode strings out at least when passing unicode strings in.
Ahhh, it actually makes some sense that replace_termcodes would return invalid strings. The returned value is the internal representation Vim uses, i.e. there may be some escaping going on.
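To illustrate the escaping, here is a minimal sketch using a fragment of the byte output from the first example ("p…l"). It assumes the extra bytes come from Vim escaping a literal 0x80 byte as the three-byte sequence 0x80 0xfe 'X', which matches the output shown above. The helper names are made up for this example, and the naive replace is only an illustration, not a safe general decoder for replace_termcodes output.

```python
# Fragment copied from the un-hooked replace_termcodes output: "p…l".
# U+2026 is e2 80 a6 in UTF-8, but the 0x80 byte arrives as 80 fe 58.
raw = b"p\xe2\x80\xfeX\xa6l"

# Assumption: a literal 0x80 byte is escaped as 0x80 0xfe 'X'.
K_SPECIAL_ESCAPED = b"\x80\xfeX"


def try_decode(data):
    """Return the UTF-8 decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def unescape_k_special(data):
    """Naively undo the assumed escaping of literal 0x80 bytes."""
    return data.replace(K_SPECIAL_ESCAPED, b"\x80")


decoded_raw = try_decode(raw)                        # fails, like DecodeHook does
decoded_fixed = try_decode(unescape_k_special(raw))  # succeeds after unescaping
```

The first decode fails for the same reason as the traceback: 0xfe is not a valid UTF-8 continuation byte.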
Quick fix is to call replace_termcodes from a session without DecodeHook. The "right way(TM)" is to have replace_termcodes return a new Binary type that is not treated as a decodable string.
Can't that be done with another SessionHook? For example, see how the ScriptHost class uses a hook to emulate the legacy behavior of eval.
@tarruda Yes, I created a pull request, google/vroom#78, that keeps two Nvim objects: one with and one without the DecodeHook. It seems to work as intended.
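A rough sketch of the two-object idea, for illustration only. The DualClient class is hypothetical and is not necessarily how google/vroom#78 implements it. It simply sends replace_termcodes to the client without the hook and forwards everything else to the decoding client.

```python
class DualClient(object):
    """Hypothetical wrapper around two clients for the same Nvim session.

    raw_client:      client without a DecodeHook (returns bytes)
    decoding_client: client with a DecodeHook installed (returns str)
    """

    def __init__(self, raw_client, decoding_client):
        self._raw = raw_client
        self._decoding = decoding_client

    def replace_termcodes(self, *args, **kwargs):
        # The result is Vim's internal representation and may not be
        # valid UTF-8, so skip the decoding client for this call.
        return self._raw.replace_termcodes(*args, **kwargs)

    def __getattr__(self, name):
        # Only reached for attributes not defined above; everything
        # else is forwarded to the decoding client.
        return getattr(self._decoding, name)
```

With the objects from the report, this would presumably be constructed as DualClient(vim, vim.with_hook(neovim.DecodeHook())). Callers still have to handle the bytes returned by replace_termcodes themselves.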
Sounds fine as a workaround. Is it possible to make this less brittle? I don't understand where the decoding problem arises, but it seems like python should have enough context to DTRT.
Possible, yes, but not in the short run. As far as DTRT goes, replace_termcodes() returns invalid strings by design, but since the DecodeHook tries to convert all binary strings into Unicode, it causes the error. At this point it is not possible to enable or disable the DecodeHook for individual function calls.
With the latest changes, we could change replace_termcodes to always return bytes.