xterm.js
Wide nerd font characters are cut to single width
Powerline chars were trimmed to only fill a single cell in https://github.com/xtermjs/xterm.js/pull/3279; we need to look at the character codes in nerd fonts and make sure wide ones are trimmed to 2 cells.
Actual:
Expected (Terminal.app):
VS Code issue: https://github.com/microsoft/vscode/issues/123629
Wild guess - once again unicode v13 issue :smiley_cat:
I have outlined a few concrete ideas to overcome this in #3304.
@jerch I guess it's related to that, but for these special characters we could just always override them to be wide.
Relevant code:
https://github.com/xtermjs/xterm.js/blob/a510ffb3c03c2db160579bea6fded2191ee59f0b/addons/xterm-addon-webgl/src/atlas/WebglCharAtlas.ts#L370-L382
To me this unicode stuff feels like chasing the white rabbit - we are always late missing crucial aspects, lol.
We cannot do this on renderer level alone, as it would screw up the buffer alignment; it definitely needs a fix on wcwidth level. But permanently adding it to the v6/v11 tables is imho the wrong move: if they were meant as wide in v6/v11, they would already take 2 cells there. Which leaves me thinking that only a proper v13 table addressing all new codepoints since v11 can handle it. Contrary to what I stated in #3304, we could even start without grapheme handling to cover at least those new single codepoints, and add graphemes in a second step.
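To illustrate the idea (a sketch only, with hypothetical names, not the actual xterm.js tables): a v13 fix would amount to layering width overrides for codepoints added or widened since v11 on top of the existing table lookup.

```typescript
// Sketch: force selected codepoint ranges to width 2, fall back to a base
// table for everything else. `baseWcwidth` stands in for the v6/v11 lookup.
type Wcwidth = (codepoint: number) => number;

// Illustrative range only: U+1F90C (PINCHED FINGERS) was added in Unicode 13,
// so a v11 table would report it as narrow.
const WIDE_OVERRIDES: Array<[number, number]> = [
  [0x1F90C, 0x1F90C],
];

function makeWcwidth(baseWcwidth: Wcwidth): Wcwidth {
  return (cp: number) => {
    for (const [lo, hi] of WIDE_OVERRIDES) {
      if (cp >= lo && cp <= hi) return 2;
    }
    return baseWcwidth(cp);
  };
}

// Naive base table for the sketch: everything narrow.
const wcwidth = makeWcwidth(() => 1);
```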
Are you sure these are real unicode characters? I was under the impression nerd fonts took unused characters and leveraged the fact that terminals normally render the whole char without the trimming we sometimes do. It's only because we trim that this is a problem, but our font rendering requires it.
@Tyriar Well, I don't know anything about nerd fonts internals - my guess would be that they overwrite several codepoints with their own glyph set to get the graphical output they want. Do they use unspecified areas? If so we'd need a global rule for that, but that's kinda dangerous if not backed up by some global rule from the consortium (or some limited pre-specification).
Do you have access to the bytes for the output above? Would help to clarify where those chars come from.
Edit: Found this here: https://github.com/ryanoasis/nerd-fonts/wiki/Glyph-Sets-and-Code-Points. Following the codepoint ranges there, they overload several higher BMP ranges.
Edit2: Scanned through the ranges, most are in PUA1, which is left unspecced by the consortium on purpose. Problem here - it is application dependent, lol. Not sure if it spans the whole PUA; we probably could get away with marking the whole area as wide for nerdfont. But this might raise questions regarding other use cases (see https://en.wikipedia.org/wiki/Private_Use_Areas for more overloads).
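For reference, the three Private Use Areas are fixed ranges in the standard (only the meaning of the codepoints is application-defined), so a predicate for them is trivial (sketch, not xterm.js code):

```typescript
// Unicode Private Use Areas:
//   BMP PUA:  U+E000..U+F8FF
//   Plane 15: U+F0000..U+FFFFD
//   Plane 16: U+100000..U+10FFFD
function isPrivateUse(cp: number): boolean {
  return (cp >= 0xE000 && cp <= 0xF8FF)
      || (cp >= 0xF0000 && cp <= 0xFFFFD)
      || (cp >= 0x100000 && cp <= 0x10FFFD);
}
```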
@Tyriar I started playing around with the renderer and I think I fixed this and some transparency issue in #3349, would be great if you could try it out
Hey guys. Would you guys mind telling me what the codepoints are for those 2 glyphs being displayed in the above screenshots? I'd like to investigate a little too. (Especially: I think there should be no powerline specific code paths in any terminal emulator, but rather a generic solution for the common case). Many thanks :)
@christianparpart No, not a powerline specific code path in any terminal emulator directly, but some way to configure it from the app side. PUA codepoints can stand for anything; they are left unspecced on purpose so vendors/applications can define them. Since a terminal is the "screen part" of a console application, we would have to make it configurable from that side, so an app using nerdfont symbols can activate the appropriate unicode metrics. Or whatever app with whatever idea of PUA.
Welcome to under-specced unicode hell, and ppl using nice looking things without bothering, why it works on their OS version but nowhere else. :imp:
When I worked on this, I installed powerlevel10k to do some tests, and from what I understood the way they handle the powerline glyphs is by manually adding a space character (there is a specific option for this) so that the next glyph doesn't render over the powerline glyph :sweat_smile:. So for xterm.js I don't think we need to configure anything; let the shell or app handle that.
@jeanp413 While the space trick would work in xterm.js, I consider this not as a good workaround, because a TE might actually have to do some erasing on an incoming whitespace again cutting things off. Furthermore the fact that this issue exists tells me, that the shell currently does not add those whitespaces. The space trick would fix the issue from the wrong side imho.
The only reliable longterm fix is to have correct unicode metrics*. For PUA this means we need some way to get that information from somewhere, as it does not exist in the unicode spec itself. This can either be some TE extension/setting (hardcoded, not preferred as it needs user interaction), or some scriptable interface from the app side (preferred, as only the app knows if it uses unspecced things). The configuration issue with PUA goes a bit further, as it is unclear how the custom glyphs shall end up in the TE. For nerdfont and xterm.js this can be solved by providing the corresponding font, and it is currently the only way to do so. Older TEs had sequences to define soft character sets; imho most newer TEs don't implement anything like that.
[*] As of now xterm.js would just need the correct codepoint run widths, and later on also their grapheme classification.
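The scriptable app-side interface could look something like the following sketch, loosely modeled on xterm.js's pluggable unicode version providers (all names here are simplified/hypothetical, this is not the actual API):

```typescript
// A pluggable width provider: the app registers one that knows which PUA
// codepoints it uses, and activates it for its session.
interface WidthProvider {
  version: string;
  wcwidth(codepoint: number): number;
}

const providers = new Map<string, WidthProvider>();
let active: WidthProvider | null = null;

function register(p: WidthProvider): void {
  providers.set(p.version, p);
}

function activate(version: string): void {
  const p = providers.get(version);
  if (!p) throw new Error(`unknown provider: ${version}`);
  active = p;
}

function width(cp: number): number {
  if (!active) throw new Error("no width provider active");
  return active.wcwidth(cp);
}

// Example: a "nerdfont" provider that widens the BMP PUA, otherwise narrow.
register({
  version: "nerdfont",
  wcwidth: cp => (cp >= 0xE000 && cp <= 0xF8FF ? 2 : 1),
});
activate("nerdfont");
```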
> Furthermore the fact that this issue exists tells me, that the shell currently does not add those whitespaces.
This issue is specific to the webgl renderer (there's some logic that cuts the width of a glyph to just one cell; I fixed this in my PR). Powerline glyphs render correctly in the canvas renderer, and the whitespace is added by the shell.
Yes the canvas renderer works around this issue with this hack: https://github.com/xtermjs/xterm.js/blob/fe1d2f6af106dd3ccb9b3c95de54d5a3b6a3f0e8/src/browser/renderer/TextRenderLayer.ts#L112-L134
Since I don't have powerline installed I cannot test the DOM renderer. So do you think creating another hack to work around the existing whitespace hack is good practice? Btw the nerdfont folks are well aware of that issue, they even have a patched wcwidth implementation linked that correctly marks their codepoints as wide, while we see them as narrow (just tested).
To me it makes no sense trying to fix something at renderer level when we clearly know that the faulty handling is caused by the parser. Imho we should fix the source of the problem, not just beautify the screen output until it fits.
> So do you think creating another hack to work around the existing whitespace hack is good practice?
Oh, no, not at all. In fact in my PR I removed the `isPowerlineGlyph` check and the `_findGlyphBoundingBox` method and just used `measureText` instead.
I really don't know much about font stuff, I'm just trying to fix this issue and make it work as in the native terminal, which currently uses the whitespace hack config in powerlevel10k. If there's a proper fix that doesn't rely on it, that should be implemented of course.
@jeanp413 the whitespace hack just makes sure something isn't printed on top of the glyph if it's interpreted as a single-width character. I'm not sure `measureText` will work since the `isPowerlineGlyph` workaround needs to trim pixels off of the glyph.
@Tyriar With the tests I've done so far it seems to be working fine. Do you know some edge cases that I could test?
| master | fix with measure text |
|---|---|
| ![]() | ![]() |
The seam there is what the cell trimming is meant to prevent:

Ah, but that's also present in the master branch with the trimming logic:
And in the native terminal too (I'm in ubuntu) using different fonts:
The images look like a rounding issue at the clipping borders to me. I had the same issue in the image addon and solved it by doing a floor offset at the right border and the left border of the next cell, which moves things slightly to the left but guarantees perfect clipping. Ofc with flooring there is a chance to cut off some subpixel information at the right side, if the source data got stretched with antialiasing into that region. But meh, imho this can only be partially fixed and would involve a much bigger atlas resolution (here glyph textures). Not sure it applies here, as the webgl interface does not have the same pixel-bound restrictions.
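The floor-offset trick amounts to this (a sketch of the idea, not the image addon's actual code): both sides of a cell boundary floor the same fractional coordinate, so adjacent cells share an exact pixel edge.

```typescript
// Integer clip boundaries for a cell with a fractional cell width.
// Neighbors share the same floored edge: no sub-pixel seam, no overlap,
// at the cost of sometimes flooring away sub-pixel data on the right.
function cellClip(col: number, cellWidth: number): { left: number; right: number } {
  const left = Math.floor(col * cellWidth);
  const right = Math.floor((col + 1) * cellWidth);
  return { left, right };
}

// With cellWidth = 9.6: cell 0 clips at [0, 9], cell 1 at [9, 19].
```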
You draw a glyph at 0x0 and its AA will overflow to the left, we allow/want this generally for nice flowing text, but not for powerline glyphs which need to be pixel perfect.
> You draw a glyph at 0x0 and its AA will overflow to the left, we allow/want this generally for nice flowing text, but not for powerline glyphs which need to be pixel perfect.
Since Powerline glyphs are in PUA you could use that as condition, or is that a bad idea?
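For context, the powerline glyphs occupy only a small slice of the BMP PUA; the nerd-fonts wiki documents roughly U+E0A0..U+E0A3 plus U+E0B0..U+E0D4 (exact bounds vary between font patches, so treat these as illustrative). A narrower condition than "all of PUA" could be a range check like:

```typescript
// Rough powerline codepoint ranges per the nerd-fonts glyph set docs
// (illustrative bounds; different patches cover slightly different spans).
function isPowerlineRange(cp: number): boolean {
  return (cp >= 0xE0A0 && cp <= 0xE0A3)
      || (cp >= 0xE0B0 && cp <= 0xE0D4);
}
```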
Also, maybe off-topic, but I would like to understand: you are using WebGL for rendering, but the web browser stack does the text stuff (IIRC). Does this mean you can get the browser's render stack to render to a texture instead of a window surface?
@christianparpart what's PUA?
Yes the webgl renderer uses a 2d canvas context to render the text to a texture which is then uploaded to the GPU, so the browser does the text rendering for us.
I believe this is fixed
> I believe this is fixed
How do you believe that? What have you done?
Sorry for the late reply, I in fact didn't notice. PUA is a Unicode range for nonstandard codepoints.
A lot of changes happened in the past year and a half in all renderers, especially NF/powerline rendering. I didn't verify though (currently going through the ~1000 issues I'm responsible for).