Alignment problem when filenames contain Unicode grapheme clusters
Far Manager version
3.0.6364
OS version
10.0.22631.4751
Other software
No response
Steps to reproduce
The main file list panel has some alignment issues; the calculation of the filename width seems to be wrong. Screenshot:
The same issue comes up in modal windows that show the same filenames. Screenshot:
Example filename:
Üzletágvezetői prémium szabályzat_Belényi Norman_2025 Q1_v1.docx
The Hungarian Unicode characters ÜűÁáőéí are usually handled correctly by Far. I have the issue ONLY when I get a file that was created on a Mac with MS Office for Mac. Maybe the Mac does not use the proper Unicode character for these, but uses a base character without an accent (e.g. ouiae) and then adds the accent (e.g. " ' double dots etc.) to the previous character with additional special Unicode code points. (Sorry, I may not be using the correct words, code points, graphemes, etc.; I am not familiar with them.)
Expected behavior
Proper alignment on the file list panel.
Actual behavior
Alignment errors, see screenshots in "Steps to reproduce" section.
Indeed, é in prémium is actually a grapheme cluster, e (U+0065) + ́ (U+0301).
Mac software for some reason produces these combining characters instead of precomposed ones, e.g. é (U+00E9).
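To make the difference concrete, here is a minimal Python sketch (not Far code) that builds both spellings of "prémium" and compares them with the standard `unicodedata` module. The variable names are illustrative only.

```python
import unicodedata

precomposed = "pr\u00e9mium"   # é as one code point, U+00E9
decomposed = "pre\u0301mium"   # e (U+0065) followed by combining acute U+0301

# Both strings are meant to look the same on screen, yet they differ
# in length and do not compare equal.
print(len(precomposed), len(decomposed))   # 7 8
print(precomposed == decomposed)           # False

# NFC composes base + combining mark into the precomposed form;
# NFD decomposes it again.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed, nfd == decomposed)   # True True
```

Normalizing to NFC only helps where a precomposed character exists; many grapheme clusters have no single-code-point equivalent, so this is not a general fix for the alignment problem.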
You see alignment issues because in recent Windows (and/or Windows Terminal) versions Microsoft tries to properly support grapheme clusters, i.e. not only render the whole sequence as a single character, but also occupy only a single cell for it, so the rest of the line inevitably shifts left. Conceptually it's the right thing to do of course, but it requires all the third-party software (e.g. Far) to also fully support grapheme clusters, i.e. take character composition into account when aligning text. It's doable, but it requires a lot of work, and we're not Microsoft, so it's unlikely to happen soon.
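The left shift can be reproduced with a small Python sketch. It is an illustration only, not how Far lays out panels: `cells`, `pad_by_length` and `pad_by_cells` are hypothetical helpers, and the cell count assumes that every combining mark shares the cell of its base character while everything else takes exactly one cell.

```python
import unicodedata

def cells(text: str) -> int:
    """Count cells, assuming combining marks take no cell of their own."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

def pad_by_length(text: str, width: int) -> str:
    # Pads by code point count: one space too few per combining mark
    # once the terminal merges the mark into the previous cell.
    return text + " " * (width - len(text))

def pad_by_cells(text: str, width: int) -> str:
    # Pads by the estimated number of occupied cells instead.
    return text + " " * (width - cells(text))

name = "pre\u0301mium.docx"   # decomposed é, as in the reported file name

naive = pad_by_length(name, 16) + "│"
aware = pad_by_cells(name, 16) + "│"

print(cells(naive[:-1]))   # 15: the separator lands one cell too far left
print(cells(aware[:-1]))   # 16: the separator lands where intended
```

Real grapheme cluster segmentation covers far more than combining marks (joiners, variation selectors, emoji sequences), which is part of why full support is a lot of work.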
For now we recommend disabling grapheme cluster support in Windows Terminal and/or Windows Console Host:
- For Windows Terminal it's in Settings - Compatibility - Text measurement mode - Windows Console.
- For Windows Console Host it's in the Registry: DWORD TextMeasurement = 2, in HKCU\Console and its subkeys.
After that grapheme clusters will look uglier, e.g. pré mium, but at least aligned.
Thanks for the insights. Shall I close this bug report issue, or re-label as a low-priority feature request?
I don't mind keeping it open as a reminder.
> ...it requires all the third party software (e.g. Far) to also fully support grapheme clusters...
Well, maybe not, not grapheme clusters. I think at least part of the problem is that what looks to the user like individual visual elements (columns, column separators, content text) is often rendered internally as a single sequence of characters, with separators (and other decorations, if any) inserted into this sequence at positions calculated from the lengths of the content text strings. Now, the length of the content text strings is the issue.
We can imagine another approach to rendering, where graphical elements are drawn "faithfully." In the example panel picture above, we have panels with the border, two columns, two separator lines (maybe something else) and so many text strings which appear in the columns starting at some position and extending to the right (and why not to the left, BTW, though I do not care about RTL).
If we first render the text strings starting where they should start and letting them extend as they happen, then render all decoration on top of already laid out text overwriting and hiding everything which should not be visible, we will never need to know text string lengths.
Here I am obligated to refer to another recent @alabuzhev's comment.
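The "text first, decorations on top" idea from this comment can be sketched with VT cursor positioning. This is a hedged illustration, not how Far renders: it assumes a terminal that honors the standard `CSI row;col H` (cursor position) and `CSI 2 K` (erase line) sequences, and `move_to` and `draw_row` are hypothetical helper names.

```python
def move_to(row: int, col: int) -> str:
    """CSI row;col H: move the cursor to a 1-based cell position."""
    return f"\x1b[{row};{col}H"

def draw_row(row: int, columns: list[tuple[int, str]], separators: list[int]) -> str:
    # Start from an empty line so stale content cannot show through.
    out = [move_to(row, 1) + "\x1b[2K"]
    # 1. Text first, left to right: each string starts at its column and
    #    extends as far as the terminal happens to lay it out.
    for col, text in columns:
        out.append(move_to(row, col) + text)
    # 2. Decorations last, at fixed cells, overwriting whatever is there.
    for col in separators:
        out.append(move_to(row, col) + "│")
    return "".join(out)

line = draw_row(1, [(2, "pre\u0301mium.docx"), (20, "12 345")], [1, 19, 27])
print(line)
```

No string length is computed anywhere, but the sketch only covers left-aligned text; what happens when a separator overwrites half of a wide character is left to the terminal.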
@MKadaner such an approach does work of course, except when it doesn't. Imagine that you need to draw the user menu with the same file names of unknown lengths on top of the panel (e.g. F2). Or centered panel titles, or any centered dialogs (e.g. F8), or wrapped text in help or viewer and so on and so forth.
@alabuzhev, yeah, right. Well, centering can be mostly avoided (I think it was not a coincidence that Windows UI shifted mainly to left-aligned strings). String boxing can be done by rendering strings where they should start, then detecting where they actually ended, and drawing the right border there. Line wrapping is indeed tricky...
With pixel-based rendering, people use platform functions to calculate the string width in pixels or, if those are not available, render to a side buffer and bit-blit the result. We did it in my previous life. Something similar can probably be done with character-position-based rendering. Whether all these complications are worth the effort is a different question.
> String boxing can be done by rendering strings where they should start, then detecting where they actually ended, and draw the right border there
One step is missing, viz. "detecting where they actually ended." So, on the second attempt, the teal rectangle should be of the right width.
Is there a good way to measure actual width of a string in character positions on Windows?
Of course, one might say that grapheme clusters will always be visually shorter than their corresponding characters, and, by extension, the visual string length will always be smaller than string.size(), so cutting the strings as if they're in ASCII English should work, but it's not necessarily the case: a lot of single characters can occupy double (and probably more) cells, so finding out the actual visible length is inevitable.
Not to mention that this "faithful" rendering will be slow and blinky, probably slideshow-like.
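The point that the visible length can be both shorter and longer than `string.size()` can be illustrated in Python. This is a rough estimate under stated assumptions, not a reliable measurement: combining marks are counted as zero cells, characters whose Unicode East Asian Width is W or F as two, everything else as one. The terminal and font have the final say, and `estimated_cells` is a hypothetical name.

```python
import unicodedata

def estimated_cells(text: str) -> int:
    """Estimate occupied cells; real terminals may disagree."""
    total = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue              # assumed to share the base character's cell
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            total += 2            # wide and fullwidth characters
        else:
            total += 1
    return total

for sample in ("abc", "pre\u0301mium", "日本語.txt"):
    print(len(sample), estimated_cells(sample))
# 3 3
# 8 7    fewer cells than code points
# 7 10   more cells than code points
```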
> So, on the second attempt, the teal rectangle should be of the right width
I was too lazy to make a gif :)
> Is there a good way to measure actual width of a string in character positions on Windows?
Unfortunately, no.
Thank you for the link. It's horrible. Around 2005, a proprietary niche Motorola OS (for phones) did provide applications with a documented API to measure the pixel width of a given Unicode string. And it actually worked: RTL, end-glyphs and everything. Now, 20 years later, we cannot have it on desktop in the most widely used OS.
> we cannot have it on desktop in the most widely used OS
For correctness's sake, in the (most used) graphical context, we can (and have long been able to). It's only the console/terminal subsystem that is deprived.
Thank you for the correction, @HamRusTal. Sure, in the context of Far development I omitted this clarification.