
[BUG] Long unicode (emojis) get wrong length calculated

Open • manuelF opened this issue 2 years ago • 1 comment

Describe the bug

Using emojis (on terminals that support them) misaligns window borders because the wrong length is computed: the number of characters is used where the displayed column width is needed.

To Reproduce

Taking emojis from https://en.wikipedia.org/wiki/X_mark:

import pytermgui as ptg

with ptg.WindowManager() as manager:
    manager.add(ptg.Window(
        ptg.Label("NormalLabel"),
        ptg.Label("1x Emoji Label: ❌"),
        ptg.Label("2x Emoji Label: ❌❌"),
        ptg.Label("3x Emoji Label: ❌❌❌"),
        ptg.Label("1x Normal Label: X"),
        ptg.Label("2x Normal Label: X X"),
        ptg.Label("3x Normal Label: X X X"),
    ))

Expected behavior

A normal outer box, with the right border aligned on every row.

Seen behaviour

Rows containing emojis have their right border pushed out of line, because each emoji is drawn two columns wide but counted as one.

╔══════════════════════════════════════╗
║              NormalLabel             ║
║           1x Emoji Label: ❌          ║
║          2x Emoji Label: ❌❌          ║
║          3x Emoji Label: ❌❌❌         ║
║          1x Normal Label: X          ║
║         2x Normal Label: X X         ║
║        3x Normal Label: X X X        ║
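
The offset can be reproduced without pytermgui. The sketch below is an illustration only, not pytermgui's code: display_width and center are hypothetical helpers, and the width rule (two columns for East Asian Wide or Fullwidth characters, looked up with the stdlib unicodedata module) is a simplifying assumption that ignores combining marks and emoji sequences.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate terminal columns: 2 for Wide/Fullwidth characters, else 1."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def center(text: str, width: int, measure) -> str:
    """Center text in `width` cells, using `measure` to size the padding."""
    pad = max(width - measure(text), 0)
    left = pad // 2
    return " " * left + text + " " * (pad - left)


label = "2x Emoji Label: ❌❌"

by_chars = center(label, 38, len)              # padding sized by character count
by_columns = center(label, 38, display_width)  # padding sized by column width

print(len(by_chars), display_width(by_chars))  # 38 characters, but 40 columns
print(display_width(by_columns))               # 38 columns
```

Padding by len() yields a row of exactly 38 characters that nevertheless occupies 40 columns, which matches the kind of shift seen on the emoji rows above.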

System information

$ ptg --version

PyTermGUI version 7.4.0

System details:
    Python version: 3.8.10
    $TERM:          xterm-256color
    $COLORTERM:     None
    Color support:  ColorSystem.EIGHT_BIT
    OS Platform:    Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.29

Possible cause

real_length may be computed incorrectly for wide characters.

https://github.com/bczsalba/pytermgui/blob/56b2cc1dc74ada438088719d2ebe95e21d509ad6/pytermgui/widgets/base.py#L781
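
Three different notions of length are in play for a string, and only the last one matters for layout. The sketch below is a stdlib-only illustration (the lengths helper is a hypothetical name, and the column rule is the same Wide/Fullwidth approximation from unicodedata, not pytermgui's implementation).

```python
import unicodedata


def lengths(text: str) -> tuple:
    """Return (UTF-8 bytes, code points, approximate terminal columns)."""
    utf8_bytes = len(text.encode("utf-8"))
    code_points = len(text)
    columns = sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )
    return utf8_bytes, code_points, columns


print(lengths("X"))   # (1, 1, 1)
print(lengths("❌"))  # (3, 1, 2)
```

For plain ASCII all three agree, which is why the mismatch only shows up on the emoji rows.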

Possible solution

See alternatives like: https://stackoverflow.com/a/30775818

Maybe?

Thanks!

manuelF • Jul 16 '23 14:07

Note that the fix provided in PR #118 (https://github.com/bczsalba/pytermgui/pull/118), once modified to match both sets of characters, displays the window correctly:

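import re
from functools import lru_cache

from wcwidth import wcswidth  # assumed to be the third-party "wcwidth" package

RE_CHINESE = re.compile(r"[\u4e00-\u9fff]")  # CJK Unified Ideographs
RE_EMOJI = re.compile(r"[\u2000-\u2fff]")  # U+2000-U+2FFF, which includes ❌ (U+274C)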

[...]

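@lru_cache(maxsize=None)
def real_length(text: str) -> int:
    # strip_ansi is pytermgui's existing helper; escape sequences never count.
    stripped = strip_ansi(text)
    if RE_CHINESE.search(stripped) or RE_EMOJI.search(stripped):
        # Wide characters present: sum each character's column width.
        return sum(wcswidth(c) for c in stripped)
    return len(stripped)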
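
One caveat with gating on these two ranges: wide characters outside them would still take the len() path. The check below is a stdlib-only sketch, not part of the PR; takes_wide_path and has_wide_char are hypothetical names, and "wide" here means East Asian Wide or Fullwidth according to unicodedata.

```python
import re
import unicodedata

RE_CHINESE = re.compile(r"[\u4e00-\u9fff]")
RE_EMOJI = re.compile(r"[\u2000-\u2fff]")


def takes_wide_path(text: str) -> bool:
    """True if the two regexes would route the text to per-character widths."""
    return bool(RE_CHINESE.search(text) or RE_EMOJI.search(text))


def has_wide_char(text: str) -> bool:
    """True if any character is East Asian Wide or Fullwidth."""
    return any(unicodedata.east_asian_width(ch) in ("W", "F") for ch in text)


# Columns: code point, matched by the regexes, actually wide
for sample in ("❌", "中", "😀", "한"):
    print(f"U+{ord(sample):04X}", takes_wide_path(sample), has_wide_char(sample))
```

Under these assumptions, 😀 (U+1F600) and 한 (U+D55C) are wide but match neither range, so calling the width function unconditionally, or widening the ranges, might be worth considering.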

manuelF • Jul 16 '23 15:07