pytermgui
[BUG] Long unicode (emojis) get wrong length calculated
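Describe the bug
Using emojis (on terminals that support them) misaligns the window borders because the computed text length is wrong. The length seems to count characters rather than the terminal columns they occupy (wide characters such as emojis take two columns).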
To Reproduce
Using the emoji from https://en.wikipedia.org/wiki/X_mark:
import pytermgui as ptg

with ptg.WindowManager() as manager:
    manager.add(ptg.Window(
        ptg.Label("NormalLabel"),
        ptg.Label("1x Emoji Label: ❌"),
        ptg.Label("2x Emoji Label: ❌❌"),
        ptg.Label("3x Emoji Label: ❌❌❌"),
        ptg.Label("1x Normal Label: X"),
        ptg.Label("2x Normal Label: X X"),
        ptg.Label("3x Normal Label: X X X"),
    ))
Expected behavior
A normal outer box with aligned borders.
Seen behaviour
On lines containing emojis the right border is offset, because each emoji is printed wider than its computed length.
╔══════════════════════════════════════╗
║ NormalLabel ║
║ 1x Emoji Label: ❌ ║
║ 2x Emoji Label: ❌❌ ║
║ 3x Emoji Label: ❌❌❌ ║
║ 1x Normal Label: X ║
║ 2x Normal Label: X X ║
║ 3x Normal Label: X X X ║
System information
$ ptg --version
PyTermGUI version 7.4.0
System details:
Python version: 3.8.10
$TERM: xterm-256color
$COLORTERM: None
Color support: ColorSystem.EIGHT_BIT
OS Platform: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.29
Possible cause
real_length is possibly computed incorrectly for wide characters:
https://github.com/bczsalba/pytermgui/blob/56b2cc1dc74ada438088719d2ebe95e21d509ad6/pytermgui/widgets/base.py#L781
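To illustrate the suspected cause, here is a minimal sketch using only the standard library. It is not pytermgui code: cell_width is a hypothetical helper that approximates terminal columns from the Unicode East Asian Width property, and it ignores cases such as emoji joiner sequences. It shows that len() and the displayed width disagree for the labels above.

```python
import unicodedata


def cell_width(text: str) -> int:
    """Approximate the terminal columns `text` occupies (hypothetical helper)."""
    width = 0
    for char in text:
        if unicodedata.combining(char):
            # Combining marks are drawn on the previous character
            continue
        # Wide ("W") and fullwidth ("F") characters take two columns
        width += 2 if unicodedata.east_asian_width(char) in ("W", "F") else 1
    return width


label = "2x Emoji Label: ❌❌"
print(len(label), cell_width(label))
```

If padding is computed from len(label), each emoji leaves the line one column longer than expected, which would match the offset borders in the output above.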
Possible solution
See alternatives like: https://stackoverflow.com/a/30775818
Maybe?
Thanks!
Note that the fix provided in PR #118 (https://github.com/bczsalba/pytermgui/pull/118), once modified to check both the Chinese and the emoji character ranges, displays correctly:
RE_CHINESE = re.compile(r"[\u4e00-\u9fff]")
RE_EMOJI = re.compile(r"[\u2000-\u2fff]")

[...]

@lru_cache(maxsize=None)
def real_length(text: str) -> int:
    # Only take the slower per-character width path when wide characters are present
    if bool(RE_CHINESE.search(text)) or bool(RE_EMOJI.search(text)):
        return sum(wcswidth(c) for c in strip_ansi(text))

    return len(strip_ansi(text))
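One caveat, sketched below with the standard library only: the two ranges above do not cover every wide character. The patterns are copied from the snippet, while takes_wide_path is a hypothetical name for the condition inside real_length. Characters such as 😀 (U+1F600) or Hangul syllables are wide according to their East Asian Width property but match neither pattern, so they would presumably still be measured with len().

```python
import re
import unicodedata

# Patterns copied from the snippet above
RE_CHINESE = re.compile(r"[\u4e00-\u9fff]")
RE_EMOJI = re.compile(r"[\u2000-\u2fff]")


def takes_wide_path(text: str) -> bool:
    """Mirror the regex check in real_length (hypothetical helper name)."""
    return bool(RE_CHINESE.search(text)) or bool(RE_EMOJI.search(text))


for sample in ("❌", "中", "😀", "한"):
    is_wide = unicodedata.east_asian_width(sample) in ("W", "F")
    print(f"U+{ord(sample):04X} wide={is_wide} matched={takes_wide_path(sample)}")
```

Dropping the regex pre-check and always summing per-character widths would avoid the gap, at the cost of the fast path for plain ASCII text.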