
Digraphs and Unicode input tools

Open Anderssorby opened this issue 3 years ago • 11 comments

In vim you can input some unicode characters by pressing Ctrl + k in insert mode and then typing two chars, like *l for λ. There are other ways too, but this is very convenient and customizable.
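
A minimal sketch of how such a two-character lookup could work, in Rust since that is what Helix is written in. The table holds only a handful of entries from Vim's default digraph set, and the names `DIGRAPHS` and `digraph` are hypothetical, not Helix APIs. The reversed-order fallback mirrors Vim's documented behaviour of trying {char2}{char1} when {char1}{char2} is not defined, which is why both l* and *l give λ.

```rust
/// A few entries from Vim's default digraph table (illustrative, not complete).
const DIGRAPHS: &[((char, char), char)] = &[
    (('a', '*'), 'α'),
    (('b', '*'), 'β'),
    (('l', '*'), 'λ'),
    (('-', '>'), '→'),
    (('E', 'u'), '€'),
];

/// Resolve a digraph, trying the typed order first and the swapped order second.
fn digraph(first: char, second: char) -> Option<char> {
    let find = |a: char, b: char| {
        DIGRAPHS
            .iter()
            .find(|((x, y), _)| *x == a && *y == b)
            .map(|(_, out)| *out)
    };
    find(first, second).or_else(|| find(second, first))
}
```

Returning `Option<char>` leaves it to the caller to decide what an unknown pair should do, for example inserting the second character unchanged.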

Anderssorby avatar Jan 04 '22 13:01 Anderssorby

How should this work? Should the typed characters appear in the buffer, and then be replaced by the Unicode character (this is how it works in vim), or should they appear at the bottom of the screen (where numbers appear before an action)? Also, what should the default keyboard shortcut be (Ctrl + k is already used)?

A-Walrus avatar Jun 19 '22 10:06 A-Walrus

Another nice way to insert unicode symbols is how Emacs' input methods work. E.g. if I have the Agda input method active in the current Emacs buffer, I can write the TeX command for a symbol, and the literal TeX command gets replaced with the unicode char it corresponds to. It's very nice when you write more math-heavy texts.

a12l avatar Feb 21 '23 17:02 a12l

Another option which would be more helix-y would be to do a text search for unicode symbols.

Anderssorby avatar Feb 27 '23 08:02 Anderssorby

So we have four columns for a text search - the unicode code point (for #4216), the literal character (escaped for control chars or modifiers), compose sequences (for this issue), and a description of the character (e.g. column 2 or 11 of UnicodeData.txt). For example:

0009  <TAB>  HT  Character Tabulation
00fe  þ      th  Latin Small Letter Thorn
093f  ि     *i  Devanagari Vowel Sign I
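
As a rough sketch of what searching across those four columns might look like, the Rust snippet below stores the three example rows and does a case-insensitive substring match against the hex code point, the literal character, the compose sequence, and the description. `Entry`, `ENTRIES`, and `search` are hypothetical names for illustration only; a real picker would load the full table and rank the results.

```rust
/// One row of the hypothetical search table.
struct Entry {
    ch: char,
    compose: &'static str,
    description: &'static str,
}

/// The three example rows from the comment above.
const ENTRIES: &[Entry] = &[
    Entry { ch: '\t', compose: "HT", description: "Character Tabulation" },
    Entry { ch: 'þ', compose: "th", description: "Latin Small Letter Thorn" },
    Entry { ch: '\u{093f}', compose: "*i", description: "Devanagari Vowel Sign I" },
];

/// Return every character whose row matches the query in any column.
fn search(query: &str) -> Vec<char> {
    let q = query.to_lowercase();
    ENTRIES
        .iter()
        .filter(|e| {
            // Column 1: code point as four or more lowercase hex digits.
            let hex = format!("{:04x}", e.ch as u32);
            hex.contains(&q)
                // Column 2: the literal character itself.
                || e.ch.to_string() == query
                // Column 3: the compose sequence.
                || e.compose.to_lowercase().contains(&q)
                // Column 4: the human-readable description.
                || e.description.to_lowercase().contains(&q)
        })
        .map(|e| e.ch)
        .collect()
}
```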

imuli avatar Apr 20 '23 12:04 imuli

IMHO, it's custom snippets + the multi language server feature.

I was trying to implement snippets for my word completion language server https://github.com/estin/simple-completion-language-server

And I found that a language server with snippet support may be useful for this goal. Each input must be separated by a space to be processed as a word; this behavior may be fixed.

https://github.com/helix-editor/helix/assets/520814/ed177572-a111-4702-9a91-3cdb9bbe2e40

estin avatar Jul 29 '23 14:07 estin

> Another option which would be more helix-y would be to do a text search for unicode symbols.

I like this way of doing it

Unicode input is very important to a package I'm writing, so I really hope this feature is implemented.

EDIT: typo

jakobjpeters avatar Aug 29 '23 00:08 jakobjpeters

Or how about US-ASCII codes for the character separators:

1c  FS  ␜  ^\  File Separator
1d  GS  ␝  ^]  Group Separator
1e  RS  ␞  ^^  Record Separator
1f  US  ␟  ^_  Unit Separator

In vim and bash we can use ctrl-v + ctrl-^ for example to enter a "record separator" and bash will display ^^ for that.
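
For reference, the caret column in the table above follows a simple rule: a control character is shown as ^ followed by the character whose code differs by 0x40. The Rust sketch below shows that mapping in both directions, plus how records delimited by RS and US could be split. All three function names are hypothetical.

```rust
/// Caret notation for an ASCII control byte, e.g. 0x1e -> "^^".
fn caret_notation(byte: u8) -> Option<String> {
    match byte {
        // Control characters and DEL differ from their caret symbol by 0x40.
        0x00..=0x1f | 0x7f => Some(format!("^{}", (byte ^ 0x40) as char)),
        _ => None,
    }
}

/// Inverse mapping: the symbol typed after Ctrl, e.g. '^' -> 0x1e.
fn from_caret(symbol: char) -> Option<u8> {
    match symbol {
        '@'..='_' | '?' => Some((symbol as u8) ^ 0x40),
        _ => None,
    }
}

/// Split text into records (RS, 0x1e) and fields (US, 0x1f).
fn split_records(data: &str) -> Vec<Vec<&str>> {
    data.split('\u{1e}')
        .map(|record| record.split('\u{1f}').collect())
        .collect()
}
```

Unlike commas, these separators are very unlikely to appear inside field values, so no quoting or escaping is needed when splitting.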

Would be nice with csv files.

bound-variable avatar Sep 04 '23 21:09 bound-variable

I am a bit concerned about basing it on a US standard, given that there are non-US countries and the US doesn't have a great track record cough imperial units cough. However, I am from the US and am not knowledgeable about what standards are out there. Is US-ASCII the standard internationally, or are there competing ideas? If it is indeed an internationally used standard, my concern above is not relevant.

jakobjpeters avatar Sep 04 '23 23:09 jakobjpeters

> > Another option which would be more helix-y would be to do a text search for unicode symbols.
>
> I like this way of doing it
>
> Unicode input is very important to a package I'm writing, so I really hope this feature is implemented.
>
> EDIT: typo

Kitty terminal implements this very nicely and does feel more helix-y

trzza avatar Oct 08 '23 20:10 trzza

I think it is more powerful to put this functionality in the terminal emulator, or even better, at the OS level. Using WinCompose/XCompose you can handle unicode keyboard input pretty well, and this carries over to all other applications automatically. Keeping it confined to a (single) text editor is quite restrictive, though it may be useful for some language-specific control.

chtenb avatar Oct 09 '23 07:10 chtenb

Something like this would be very useful, as languages like Julia and Lean both allow the use of Unicode characters in identifiers. The Julia language extension for Vim (https://github.com/JuliaEditorSupport/julia-vim) has addressed this by allowing tab-completions of LaTeX-like strings in insert mode, and I believe Lean does something similar, although their tab-completion table is not based on LaTeX.

I believe such a "macro system" would be the most ergonomic way of typing Unicode characters. The backslash key is probably not the most ergonomic modifier key, though.
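
To make the idea concrete, here is a small Rust sketch of the expansion step only: given the text before the cursor, it finds the last backslash, looks up the name that follows it, and returns the text with that name replaced. The table is a tiny hand-picked sample, and `ABBREVIATIONS` and `expand_at_cursor` are hypothetical names, not part of julia-vim or Helix.

```rust
/// A tiny sample of LaTeX-like names; a real table would be much larger.
const ABBREVIATIONS: &[(&str, &str)] = &[
    ("alpha", "α"),
    ("lambda", "λ"),
    ("to", "→"),
];

/// Expand the `\name` that ends at the cursor, if it is a known abbreviation.
fn expand_at_cursor(before_cursor: &str) -> Option<String> {
    // The trigger is the last backslash before the cursor.
    let start = before_cursor.rfind('\\')?;
    let name = &before_cursor[start + 1..];
    let (_, symbol) = ABBREVIATIONS.iter().find(|(key, _)| *key == name)?;
    // Keep everything before the backslash and append the symbol.
    Some(format!("{}{}", &before_cursor[..start], symbol))
}
```

Because the trigger character is only a parameter of the search, swapping the backslash for a more comfortable key would be a one-line change in a design like this.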

SeSodesa avatar Jul 19 '24 04:07 SeSodesa

I think it would be very nice to have an insert mode shortcut (e.g. ctrl-something) which brings up a fuzzy-find window to look for unicode characters similarly to how you'd look for files with space-f.
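
A rough sketch of the matching side of such a picker, in Rust: the query matches a character if its letters appear in order (not necessarily adjacent) in the character's Unicode name. This is plain subsequence matching with no scoring, which is simpler than what a real fuzzy finder does. `NAMES`, `is_subsequence`, and `fuzzy_filter` are hypothetical names and the table is a four-entry sample.

```rust
/// A small sample of Unicode character names (lowercased for display).
const NAMES: &[(&str, char)] = &[
    ("greek small letter lamda", 'λ'),
    ("rightwards arrow", '→'),
    ("latin small letter thorn", 'þ'),
    ("for all", '∀'),
];

/// True if the characters of `query` occur in `candidate` in the same order.
fn is_subsequence(query: &str, candidate: &str) -> bool {
    let mut rest = candidate.chars().flat_map(char::to_lowercase);
    query
        .chars()
        .flat_map(char::to_lowercase)
        // `any` advances `rest` past each match, so order is preserved.
        .all(|q| rest.any(|c| c == q))
}

/// Characters whose names fuzzily match the query, in table order.
fn fuzzy_filter(query: &str) -> Vec<char> {
    NAMES
        .iter()
        .filter(|(name, _)| is_subsequence(query, name))
        .map(|(_, ch)| *ch)
        .collect()
}
```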

aris-mav avatar Mar 21 '25 22:03 aris-mav

One thing that occurred to me is that Typst's Codex might be used for this: https://github.com/typst/codex. It's a Rust library for entering Unicode symbols in a human-readable manner.

SeSodesa avatar Mar 28 '25 18:03 SeSodesa

> One thing that occurred to me is that Typst's Codex might be used for this: https://github.com/typst/codex. It's a Rust library for entering Unicode symbols in a human-readable manner.

This is something I am currently doing with simple-completion-language-server mentioned above. For anyone interested in a short-term solution, I quickly put together the following code to generate the ~/.config/helix/unicode-input/base.toml file from codex.

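use std::collections::BTreeMap;

fn main() {
    // Symbol name -> character. A BTreeMap keeps the generated TOML sorted by name.
    let mut symbols = BTreeMap::new();

    for (name, binding) in codex::SYM.iter() {
        // Skip bindings that are not symbol definitions.
        let codex::Def::Symbol(symbol) = binding.def else { continue };
        if let codex::Symbol::Single(single) = symbol {
            // A symbol with a single form maps directly to its character.
            symbols.insert(
                name.to_string(),
                single,
            );
        } else if let codex::Symbol::Multi(multi) = symbol {
            // A symbol with variants gets one entry per variant, keyed as "name.variant".
            for (child, character) in multi {
                symbols.insert(
                    // The variant with an empty name keeps the bare symbol name.
                    if child == &"" {
                        name.to_string()
                    } else {
                        format!("{}.{}", name, child)
                    },
                    *character,
                );
            }
        }
    }

    // Write the whole table to stdout as TOML.
    print!("{}", toml::to_string(&symbols).unwrap());
}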

Here is the dependency list.

codex = "0.1.1"
toml = "0.8.20"

aaron-jack-manning avatar Mar 28 '25 22:03 aaron-jack-manning

TLDR: This feature fits best in Helix itself.

> I think it is more powerful to put this functionality in the terminal emulator, or even better, in the OS level.

In the OS -- more powerful: hmmm. Maybe. But it would break the terminal experience in many cases! Imagine this: a user wants to enter a unicode character on macOS or Windows. To do so, they press an OS-specific hotkey and a dialog pops up. That would make an OS-specific modal window or similar pop up on top of the Helix TUI. (Do you see any other way?) If that's how it plays out, that's a gross UI experience. The people want the terminal, after all :) 🧑‍🤝‍🧑

In terminal emulator: no, I don't think it would be more powerful. I say this because I don't think a terminal emulator could offer a good user experience doing it; it couldn't offer a textual menu nor fuzzy search, could it? I suppose it could summon an OS-specific dialog box from the depths infernal. OS-specific windows + TUI = 🤕 -- which follows the same concern I mentioned in the above paragraph.

Bottom line, IMO: the current Helix user experience could be improved by addressing this in Helix.

Waiting for another layer to provide it in a way that feels Helixy 🧬 is optimistic at best.

> Keeping it confined to a (single) text editor is quite restrictive, though it may be useful for some language specific control.

There is no need to keep this "confined". Helix can provide it and people who want to use it can. If an OS wants to provide it and people like that, great. If some terminal emulator makes it happen, all good.

xpe avatar Jul 31 '25 16:07 xpe

> But it would break the terminal experience in many cases! Imagine this: a user wants to enter a unicode character on macOS or Windows. To do so, they press an OS-specific hotkey and a dialog pops up.

Not at all. Have you ever used XCompose (WinCompose on Windows)? Usually there is a single designated compose key (right-alt is often picked for this). Unicode characters, or really any string of characters, are then inserted through certain key combinations. The key combinations are fully configurable. Read https://wiki.portal.chalmers.se/agda/Main/XCompose

I use this on Windows and Linux, and the mappings carry over to any application. Personally I type a lot of unicode math, both in Helix and in Discord.

chtenb avatar Jul 31 '25 18:07 chtenb

I do confirm that Helix does not support the "compose" key at this stage, which is really unfortunate for a text editor.

Under Linux and X11, these shortcuts are listed in this file:

hx /usr/share/X11/locale/en_US.UTF-8/Compose

For example, we get all the currency signs: <compose> = l outputs £ and <compose> = e outputs €.

But not in Helix (the compose key advances by one block but outputs nothing). This is a major issue.

webdev23 avatar Aug 22 '25 11:08 webdev23

@webdev23 That makes no sense to me. The compose key is not something an application has to support, it should work in all applications. Moreover, it works in Helix for me under both Windows and Linux. I would suggest you troubleshoot further into your specific setup, like terminal emulator, etc.

chtenb avatar Aug 22 '25 14:08 chtenb

You are right, the issue is tmux in this case. This does however show that I am losing it, as so many years-old bugs everywhere do not make for a trustworthy dev environment at all.

webdev23 avatar Aug 22 '25 16:08 webdev23

In Kitty, a terminal that Helix supports, inserting arbitrary UTF-8 encoded text can be done very easily, so you are not limited to Vim digraphs.

See "Send arbitrary text on key presses" in the kitty.conf section of the documentation.

MarHaj avatar Nov 02 '25 12:11 MarHaj