atuin
[Bug]: Chinese input methods cause errors when entering Chinese content in the Webstorm terminal.
What did you expect to happen?
I can't tell whether the problem is in WebStorm or Atuin. I've also filed an issue with WebStorm:
https://youtrack.jetbrains.com/issue/WEB-67048/Chinese-input-method-causes-other-program-errors-when-entering-Chinese-content-in-the-terminal
What happened?
This problem occurs when typing Chinese content with Chinese input methods, and only occurs in the webstorm terminal.
https://github.com/atuinsh/atuin/assets/31089228/2c79c7e5-af11-442e-b1c1-a9c3e89ba0b8
❯ �<0080>thread 'main' panicked at library/std/src/env.rs:171:83:
called `Result::unwrap()` on an `Err` value: "\xE3\x80"
stack backtrace:
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: core::result::unwrap_failed
3: <std::env::Vars as core::iter::traits::iterator::Iterator>::next
4: <config::env::Environment as config::source::Source>::collect
5: config::source::Source::collect_to
6: <[alloc::boxed::Box<dyn config::source::Source+core::marker::Send+core::marker::Sync>] as config::source::Source>::collect
7: config::builder::ConfigBuilder<config::builder::DefaultState>::build_internal
8: config::builder::ConfigBuilder<config::builder::DefaultState>::build
9: atuin_client::settings::Settings::new
10: atuin::command::client::Cmd::run
11: atuin::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace. � thread 'main' panicked at library/std/src/env.rs:171:83:
called `Result::unwrap()` on an `Err` value: "\xE3"
stack backtrace:
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: core::result::unwrap_failed
3: <std::env::Vars as core::iter::traits::iterator::Iterator>::next
4: <config::env::Environment as config::source::Source>::collect
5: config::source::Source::collect_to
6: <[alloc::boxed::Box<dyn config::source::Source+core::marker::Send+core::marker::Sync>] as config::source::Source>::collect
7: config::builder::ConfigBuilder<config::builder::DefaultState>::build_internal
8: config::builder::ConfigBuilder<config::builder::DefaultState>::build
9: atuin_client::settings::Settings::new
10: atuin::command::client::Cmd::run
11: atuin::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace. �<0080><0082>/
Atuin doctor output
Atuin Doctor
Checking for diagnostics
Please include the output below with any bug reports or issues
atuin:
version: 18.2.0
sync: null
shell:
name: zsh
default: zsh
plugins:
- atuin
system:
os: Darwin
arch: arm64
version: 14.4.1
disks:
- name: Macintosh HD
filesystem: apfs
- name: Macintosh HD
filesystem: apfs
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Judging by the issue you've opened with webstorm, and that this doesn't occur in other terminals, I suspect it's just a webstorm problem. I haven't heard of anyone else having problems with Chinese input methods
The panic is reproducible with the following command.
$ a=$'\x80' atuin search
thread 'main' panicked at library/std/src/env.rs:171:83:
called `Result::unwrap()` on an `Err` value: "\x80"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
This is probably related: https://doc.rust-lang.org/std/env/fn.vars.html#panics
This means that any Rust program that calls std::env::vars anywhere will panic if any environment variable (including ones the program doesn't use) contains binary data or a string in a different encoding.
This implies that when a user switches the locale by setting e.g. LANG, LC_CTYPE, or LC_ALL, all the environment variables need to be cleared for Rust programs to work. This is a strange requirement by Rust.
Hm that doesn't reproduce the issue for me 🤔
Is there a specific terminal or setup you're using?
- I tried the v18.2.0 release and both the debug and release builds of commit eebfd04. The problem reproduces in every binary I could test. However, the current main branch doesn't compile on my host, so I haven't tested the latest main.
- I don't think the terminal matters when using the above command (because the a=$'\x80' part mimics the terminal's behavior in the report), but I tried different terminals anyway. The panic reproduces in all the terminals I quickly tried, including GNU Screen, Terminology, Mintty, Alacritty, and Kitty.
- The panic reproduces with each of LC_CTYPE='en_US.UTF-8', LC_CTYPE='ja_JP.UTF-8', LC_CTYPE='C.UTF-8', and LC_CTYPE='C'.
- The panic also reproduces with the default .config/atuin/config.toml (i.e., the one generated when it is missing).
- The panic also reproduces with an empty command history.
- I also tried Zsh, Bash/bash-preexec, and Bash/ble.sh. The problem reproduces in all cases.
Ah I see, got it. Thanks
This implies that when a user switches the locale by setting e.g. LANG, LC_CTYPE, or LC_ALL, all the environment variables need to be cleared for Rust programs to work. This is a strange requirement by Rust.
It's actually just the usage of std::env::vars, which yields String values, and these must be valid UTF-8. std::env::vars_os yields OsString values, which have no such requirement. We could probably wrap this, use only the values that are valid UTF-8 internally, and ignore the rest rather than panicking.
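A rough sketch of that wrapping idea, not Atuin's actual code: the function name utf8_vars is made up for illustration, and how the result would be handed to the config crate is left open. It reads from std::env::vars_os and silently drops any pair whose name or value is not valid UTF-8.

```rust
use std::collections::HashMap;
use std::ffi::OsString;

/// Hypothetical helper: keep only the variables whose name and value
/// are both valid UTF-8, and drop everything else without panicking.
fn utf8_vars<I>(vars: I) -> HashMap<String, String>
where
    I: IntoIterator<Item = (OsString, OsString)>,
{
    vars.into_iter()
        .filter_map(|(key, value)| {
            // into_string fails for non-UTF-8 data; `.ok()?` turns that into a skip.
            Some((key.into_string().ok()?, value.into_string().ok()?))
        })
        .collect()
}

fn main() {
    // Unlike std::env::vars, std::env::vars_os does not panic on non-UTF-8 data.
    let vars = utf8_vars(std::env::vars_os());
    println!("{} UTF-8 environment variables", vars.len());
}
```

With this, something like a=$'\x80' in the environment would simply be missing from the map instead of aborting the program.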
Regardless of this, I'm still curious as to why this only happens with webstorm (for OP)
This problem occurs when typing Chinese content with Chinese input methods, and only occurs in the webstorm terminal.
This is just my guess, but I think the WebStorm terminal sends the user input with an encoding different from UTF-8. Then, the shell receives the data and stores it in an environment variable. This behavior might be caused by just the user's configuration for the terminal encoding, or it might be the WebStorm terminal's issue with handling the data from the system's input method.
So, I think the reported problem could be solved if the WebStorm terminal is properly configured or fixed. Nevertheless, I also see it as a problem that the config crate scans environment variables the program doesn't even use and panics on any that are not valid UTF-8. Atuin shouldn't be affected by random environment variables that it doesn't use.
For the non-UTF-8 environment variables that Atuin actually uses, I'm not sure about the desired behavior. It may just ignore the environment variable or print a warning message.
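As a sketch of the two behaviors described above (ignore unrelated variables, and warn about broken ones that Atuin would read), something like the following could work. It is only an illustration: the function name collect_prefixed is hypothetical, and the ATUIN_ prefix is an assumption about how Atuin's settings are named in the environment.

```rust
use std::ffi::OsString;

/// Hypothetical helper: collect variables starting with `prefix`.
/// Returns the usable (name, value) pairs plus a warning message for each
/// prefixed variable whose value is not valid UTF-8.
fn collect_prefixed<I>(vars: I, prefix: &str) -> (Vec<(String, String)>, Vec<String>)
where
    I: IntoIterator<Item = (OsString, OsString)>,
{
    let mut settings = Vec::new();
    let mut warnings = Vec::new();
    for (key, value) in vars {
        // Names that are not valid UTF-8 are skipped; this sketch does not
        // treat them as settings.
        let key = match key.to_str() {
            Some(k) => k.to_owned(),
            None => continue,
        };
        // Variables the program does not use are never inspected further.
        if !key.starts_with(prefix) {
            continue;
        }
        match value.into_string() {
            Ok(value) => settings.push((key, value)),
            Err(_) => warnings.push(format!("ignoring {key}: value is not valid UTF-8")),
        }
    }
    (settings, warnings)
}

fn main() {
    // "ATUIN_" is an assumed prefix, used here only for illustration.
    let (settings, warnings) = collect_prefixed(std::env::vars_os(), "ATUIN_");
    for warning in &warnings {
        eprintln!("warning: {warning}");
    }
    println!("{} settings read from the environment", settings.len());
}
```

Whether a broken value should be a warning or be ignored silently is the open design question; the sketch returns the warnings separately so the caller can decide.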