love Text width is calculated incorrectly when highpdi is enabled in Love 12.0

The text width is calculated incorrectly when highpdi is enabled. Here are the examples I tested with the default font:

print(font:getWidth('a')) -- 7
print(font:getWidth(string.byte('a'))) -- 8

local rasterizer = love.font.newRasterizer()
print(math.floor(rasterizer:getGlyphData('a'):getAdvance() / love.window.getDPIScale() + 0.5)) -- 8 (7.6 float)

And an example of another mistake:

print(font:getKerning('f', 'f')) -- 0 (Ensure the kerning is zero)

love.graphics.print(font:getWidth('ffff'), 0, 0) -- 18
love.graphics.print(font:getWidth('f'), 0, 10) -- 4

In computeGlyphPositions function string 'ffff' splits into 'ff' pairs, so when calculating the width love code calculates and rounds the value for a pair of characters like floorf((glyphpos.x_advance >> 6) / dpiScales[0] + 0.5f). That's why there is a difference if you calculate each character individually.

Apr 24 '23 09:04 modcos

In the first example I believe 7 is more correct than 8 - the text shaper (Harfbuzz) produces an un-rounded value of 7.2. GlyphData:getAdvance and the other variant of Font:getWidth don't go through the text shaper.

I'll do more investigation when I have more time but it's possible the end result to fix this issue will be using the (new) text shaper in more places as well as making APIs to encourage people to use it rather than going behind its back and trying to calculate things manually.

Apr 24 '23 23:04 slime73

Thanks for the reply. I want to implement ui for text input, text selection and moving the cursor through the text. I don't know if love has a function that can return the character and the position of the character, relative to a given width (from the beginning of the text).

text_offset

Here's an example of what I use glyph data for. For the same reason, I only use getting the size for a single character.

I'm sorry if I don't explain well and make mistakes in the text. I do not speak English very well.

Apr 25 '23 00:04 modcos

I don't know if love has a function that can return the character and the position of the character, relative to a given width (from the beginning of the text).

It doesn't yet, but it's planned for 12.0. :)

Apr 25 '23 00:04 slime73

I figured out the reason why 'ffff' splits into 'ff' pairs in Harfbuzz.

"This is the result of ligature glyph substitutions. OpenType spec recommends that Standard Ligatures be applied by default, that's exactly what Harfbuzz is doing."

Here's more information: https://stackoverflow.com/questions/62552451/why-harfbuzz-shape-2-single-char-into-one-glyph https://harfbuzz.github.io/shaping-opentype-features.html

Apr 25 '23 17:04 modcos

I found another strange behavior:

local font = love.graphics.getFont()
local _, wrapped = font:getWrap('Приветff', 1000)
print(#wrapped) -- 2
local _, wrapped = font:getWrap('Приветfff', 1000)
print(#wrapped) -- 1

Jul 22 '23 08:07 modcos

I found another strange behavior:

Just documenting this here for the future since I haven't figured out the right solution yet: this is caused by the HarfbuzzShaper::computeWordWrapIndex code using harfbuzz' cluster API to get back the original codepoint indices for a string when determining where to wrap, without accounting for clusters that are the result of multiple codepoint characters combined into one glyph (like the ff in the above code). The trouble is that harfbuzz doesn't have anything I can find to easily determine the last codepoint index in a cluster, only the first...

Sep 04 '23 23:09 slime73

The number of character in a cluster can be clavulated by checking the cluster of the next glyph, see https://harfbuzz.github.io/a-clustering-example-for-levels-0-and-1.html

Sep 05 '23 01:09 khaledhosny

and https://harfbuzz.github.io/clusters.html

Sep 05 '23 01:09 khaledhosny

Yeah, it gets a little complicated in user code when there are multiple fonts and multiple runs of text though, because the next glyph isn't available from an index increment in that case. It would be way simpler (and easier to understand what's going on in general) if harfbuzz provided a start/count or begin/end range instead of just the start, but alas...

Sep 05 '23 01:09 slime73

It is little complicated, but I don’t see why multiple fonts makes it more so. HarfBuzz works on one run at time and all output values are local to that, the other runs don’t concern the current run.

Sep 05 '23 01:09 khaledhosny

Providing a range wouldn’t be any easier because the relationship is more complicated than that. When there is decomposition, multiple glyphs can have the same cluster, and with re-ordering you can have a multi-character multi-glyph cluster (glyphs G to G+N1 belong to characters C to C+N2).

Sep 05 '23 01:09 khaledhosny

HarfBuzz works on one run at time and all output values are local to that, the other runs don’t concern the current run.

That's not the case in this situation, where you can't know the total number of input unicode values in the last glyph of a run without knowing where the next run starts. Unless I'm missing an API?

Sep 05 '23 01:09 slime73

That is correct, so the mapping of glyphs to input character indices should ideally be done right after shaping where all the needed information is present.

Sep 05 '23 01:09 khaledhosny

Right, this adds complexity to user code where it doesn't seem like it would be strictly necessary if harfbuzz had slight changes in its APIs. Harfbuzz provides an incomplete mapping of glyphs to character indices, so I have to add user code to make the mapping complete.

As you can see, this has already resulted in bugs :(

Sep 05 '23 01:09 slime73

HarfBuzz is API and ABI stable, so no change is possible (only new APIs), but also there is no better solution. The glyph <-> character relationship can be one to one, one to many, many to one and many to many. The current API can already handle all of this at the expense of slightly complicated user code. Any new API would need to handle all of it as well.

Sep 05 '23 01:09 khaledhosny

# one to one
G1 G2 G3 G4
C1 C2 C3 C4

# one to many
G1 G2
C1 C4

# many to one
G1 G2 G3 G4
C1 C1 C2 C3

# many to many
G1 G2 G3 G4
C1 C1 C1 C5

Sep 05 '23 02:09 khaledhosny

I understand that released code is hard to change, but that doesn't mean an API is the best version of what it can be, just that there's more friction to any improvement.

The current cluster API is confusing and complicated to work with, which ultimately makes for buggier programs. I can spend time solving those bugs in my code, but to me the fact that I have to is a sign that something in the library is worth improving. I can think of a few possible changes to harfbuzz' APIs that would probably work in all your examples, if I think a bit broader than what's there right now. :)

Sep 05 '23 02:09 slime73

Please forward you suggestions to https://github.com/harfbuzz/harfbuzz/issues

Sep 05 '23 02:09 khaledhosny