opentui

bug: CJK char corruption in text rendering

Open zenyr opened this issue 1 month ago • 17 comments

Problem

Both Solid and React reconcilers corrupt certain CJK characters during rendering. ~Core handles them correctly.~

Expected → Actual (Solid/React)
你好世界 → 你好世ç界
中文   → 中文æ
한글   → í한글

Emoji & ASCII unaffected.
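
One observation that may help narrow this down: in each row of the table above, the stray Latin character has the same code point value as the first UTF-8 byte of a neighbouring CJK character (界 encodes to E7 95 8C and ç is U+00E7; 文 starts with E6 and æ is U+00E6; 한 starts with ED and í is U+00ED). The sketch below checks that arithmetic. `leadByteAsChar` is a hypothetical helper written for this illustration, not an opentui API, and the check says nothing about where in the pipeline the byte leaks.

```typescript
// Hypothetical helper (not part of opentui): returns the character you get
// when the first UTF-8 byte of `ch` is misread as a standalone code point.
function leadByteAsChar(ch: string): string {
  const bytes = new TextEncoder().encode(ch);
  const first = bytes[0];
  if (first === undefined) {
    throw new Error("expected a non-empty string");
  }
  return String.fromCodePoint(first);
}

// Pairs taken from the Expected / Actual table in this issue.
const observed: Array<[string, string]> = [
  ["界", "ç"],
  ["文", "æ"],
  ["한", "í"],
];

for (const [cjk, stray] of observed) {
  console.log(cjk, leadByteAsChar(cjk), leadByteAsChar(cjk) === stray);
}
```

If the match holds for other corrupted strings too, that would point at a lead byte being emitted on its own somewhere, rather than at a wrong glyph lookup.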

Root Cause

Encoding/displayWidth mismatch in JSX→TextNode conversion path. Likely culprit: reconciler text handling vs core displayWidth calc.

Investigation Notes

  • Core library handles all CJK/emoji correctly
  • Issue affects both Solid and React reconciler text rendering paths
  • Reproduction: specific CJK chars render as different glyphs

Next Steps

  1. Debug TextNode creation from Solid/React JSX
  2. Verify displayWidth consistency with core
  3. Check for encoding issues in reconciler text handling
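
For step 2, it may help to compare the three different "lengths" a string has, since mixing them up is a common source of width bugs. The sketch below is an assumption-laden illustration: `approxColumns` and `measure` are hypothetical names, and the wide-character check covers only CJK Unified Ideographs and Hangul Syllables, so it is not a substitute for the core displayWidth calculation.

```typescript
// Simplified and incomplete: only two well-known wide blocks are recognised.
// Real width tables (East Asian Width, emoji, combining marks) are much larger.
function approxColumns(text: string): number {
  let columns = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    const wide =
      (cp >= 0x4e00 && cp <= 0x9fff) || // CJK Unified Ideographs
      (cp >= 0xac00 && cp <= 0xd7a3); // Hangul Syllables
    columns += wide ? 2 : 1;
  }
  return columns;
}

// Hypothetical helper: reports the three measures side by side.
function measure(text: string): { utf16Units: number; utf8Bytes: number; columns: number } {
  return {
    utf16Units: text.length, // what JavaScript's .length reports
    utf8Bytes: new TextEncoder().encode(text).length, // what a byte buffer sees
    columns: approxColumns(text), // approximate terminal cells
  };
}

for (const sample of ["abc", "中文", "한글"]) {
  console.log(sample, measure(sample));
}
```

For "中文" the three values are 2, 6, and 4, so any code path that passes one measure where another is expected will be off by a non-trivial amount.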

Created with assistance from OpenCode & Claude Haiku 4.5

zenyr avatar Nov 03 '25 03:11 zenyr

Thanks for looking into this. I think the root cause for something like this is more likely in renderer.zig (the ANSI output) or in the width calculations in utf8.zig. Setting a default background prevents text from ever being rendered on a transparent terminal background.

kommander avatar Nov 03 '25 11:11 kommander

Image

I can't seem to be able to reproduce that. What terminal are you using?

kommander avatar Nov 03 '25 14:11 kommander

Oh, okay. I'll attach my system configuration here. I also found this issue in a Windows 11 PowerShell / CMD environment. I've made a tangent PR (adding a React test suite with skipped tests) in #262.

❯ fastfetch
                     ..'          jinhyeok
                 ,xNMM.           -----------------------------------------
               .OMMMMo            OS: macOS Sequoia 15.6.1 arm64
               lMM"               Host: MacBook Pro (14-inch, 2023)
     .;loddo:.  .olloddol;.       Kernel: Darwin 24.6.0
   cKMMMMMMMMMMNWMMMMMMMMMM0:     Uptime: 7 days, 19 hours, 46 mins
 .KMMMMMMMMMMMMMMMMMMMMMMMWd.     Packages: 230 (brew), 16 (brew-cask)
 XMMMMMMMMMMMMMMMMMMMMMMMX.       Shell: zsh 5.9
;MMMMMMMMMMMMMMMMMMMMMMMM:        Display (Color LCD): 3024x1964 @ 120 Hz (as 1512x982) in 14" [Built-in] *
:MMMMMMMMMMMMMMMMMMMMMMMM:        Display (RTK UHD HDR): 3360x1890 @ 60 Hz (as 1680x945) in 24" [External]
.MMMMMMMMMMMMMMMMMMMMMMMMX.       Display (LG HDR 4K): 3840x2160 @ 60 Hz (as 1920x1080) in 27" [External]
 kMMMMMMMMMMMMMMMMMMMMMMMMWd.     Display (LG HDR 4K): 3840x2160 @ 60 Hz (as 1920x1080) in 27" [External]
 'XMMMMMMMMMMMMMMMMMMMMMMMMMMk    DE: Aqua
  'XMMMMMMMMMMMMMMMMMMMMMMMMK.    WM: Quartz Compositor 278.4.7
    kMMMMMMMMMMMMMMMMMMMMMMd      WM Theme: Multicolor (Light)
     ;KMMMMMMMWXXWMMMMMMMk.       Font: .AppleSystemUIFont [System], Helvetica [User]
       "cooc*"    "*coo'"         Cursor: Fill - Black, Outline - White (32px)
                                  Terminal: tmux 3.4
                                  CPU: Apple M2 Max (12) @ 3.50 GHz
                                  GPU: Apple M2 Max (38) @ 1.40 GHz [Integrated]
❯ ghostty --version
Ghostty 1.3.0-main+3f75c66e8

Version
  - version: 1.3.0-main+3f75c66e8
  - channel: tip
Build Config
  - Zig version   : 0.15.2
  - build mode    : .ReleaseFast
  - app runtime   : .none
  - font engine   : .coretext
  - renderer      : renderer.generic.Renderer(renderer.Metal)
  - libxev        : kqueue

Update:

  • I'm not sure if this is related, but this happened twice in a row for me, which is weird (note the stray "t" in the screenshot).
Image

zenyr avatar Nov 04 '25 05:11 zenyr

Ahh tmux might be the culprit, it has some Unicode quirks, will try to reproduce.

The issue with the overlapping "t" is a different one, that needs flexShrink=0 and is handled in an issue on opencode.

kommander avatar Nov 04 '25 13:11 kommander

> tmux might be the culprit

Hmm, while I agree with the reasoning, I could reproduce this 100% of the time on my Windows 11 device, which does not have tmux or anything related to it installed.

zenyr avatar Nov 05 '25 01:11 zenyr

Yes, the Windows terminals calculate Unicode width differently and, like tmux, do not fully follow the standard. Proper Unicode support is lacking in many implementations, unfortunately.

kommander avatar Nov 05 '25 14:11 kommander

Image

I saw some of it happening. It happens at wrapping break points, so it might calculate the byte offset for the chars wrong there. Similarly, for streaming content, when the text is incomplete the buffer might end in the middle of a grapheme.
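
To illustrate the kind of failure described here, the following is a minimal sketch of what happens when a UTF-8 buffer is cut at an arbitrary byte offset, next to a cut that first backs up to a code point boundary. It is written in TypeScript for readability and is not taken from renderer.zig or utf8.zig; `naiveSplit` and `safeSplit` are hypothetical names. Note that it only respects code point boundaries; multi-code-point graphemes (emoji sequences, combining marks) would need a real grapheme segmenter.

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder("utf-8");

// Naive split: cut the UTF-8 bytes wherever the offset happens to land.
function naiveSplit(text: string, byteOffset: number): [string, string] {
  const bytes = encoder.encode(text);
  return [
    decoder.decode(bytes.subarray(0, byteOffset)),
    decoder.decode(bytes.subarray(byteOffset)),
  ];
}

// Safer split: continuation bytes look like 0b10xxxxxx, so back up until the
// cut no longer lands on one. Both halves are then valid UTF-8.
function safeSplit(text: string, byteOffset: number): [string, string] {
  const bytes = encoder.encode(text);
  let cut = Math.min(Math.max(byteOffset, 0), bytes.length);
  while (cut > 0 && cut < bytes.length && ((bytes[cut] ?? 0) & 0xc0) === 0x80) {
    cut--;
  }
  return [decoder.decode(bytes.subarray(0, cut)), decoder.decode(bytes.subarray(cut))];
}

// "中文" is six bytes (E4 B8 AD E6 96 87); offset 4 lands inside 文.
const [naiveHead, naiveTail] = naiveSplit("中文", 4);
const [safeHead, safeTail] = safeSplit("中文", 4);
console.log(JSON.stringify([naiveHead, naiveTail]));
console.log(JSON.stringify([safeHead, safeTail]));
```

With the naive cut, the decoder substitutes U+FFFD for the broken sequence; a pipeline that instead passes the raw lead byte through as its own character would produce output like the `æ` seen in this issue.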

kommander avatar Nov 06 '25 00:11 kommander

Ah sure, the native Windows terminal is already notorious among CJK users :)

However, I think I can reliably reproduce the grapheme error, with or without chunk streaming.

For example:

Image

알겠습니다. Task 에이전트에 ktlint + detekt 검사를 위임하겠습니다.

This rendering issue persists regardless of whether the text is streamed. Here is another screenshot of the same session, loaded without tmux (opencode --continue):

Image

zenyr avatar Nov 06 '25 01:11 zenyr