love icon indicating copy to clipboard operation
love copied to clipboard

[12.0] love.graphics.print(): misordered characters when using fallback fonts

Open frank-f-trafton opened this issue 2 years ago • 4 comments

Version: 12.0-development (8951635)

Repro: 12-fallbacks.zip

Problem: When drawing text with love.graphics.print() using a font that has been configured with fallbacks, glyphs originating from the fallback fonts may appear in the wrong place.

Expected appearance of text, using NotoSans with NotoSansSymbol2 as a fallback:

a⇨b⇨c The⇨quick⇨brown⇨fox a⏎a⏎⇨⏎

Observed:

a⇨⇨bc The⇨⇨⇨quickbrownfox a⏎⏎⇨⏎a

The expected order appears in LÖVE 11.5:

11 5 12 0-dev-8951635

frank-f-trafton avatar Dec 07 '23 00:12 frank-f-trafton

I've noticed something of a pattern with this bug: The string will render normally up to the point that the first character requiring a fallback font is encountered, at which point it will render every character requiring the fallback at once before continuing on with what's left of the string. Not sure how much this discovery will help, and I haven't had an opportunity to test it with multiple fallback fonts just yet, though it's consistent with the behavior seen above.

Phoenix-Flare avatar May 22 '25 07:05 Phoenix-Flare

I found the problem.

void HarfbuzzShaper::computeBufferRanges(const ColoredCodepoints &codepoints, Range range, std::vector<BufferRange> &bufferranges)
{
	bufferranges.clear();

	if (codepoints.cps.size() == 0)
		return;

	// Less computation for the typical case (no fallback fonts).
	if (rasterizers.size() == 1)
	{
		hb_buffer_reset(hbBuffers[0]);
		hb_buffer_add_codepoints(hbBuffers[0], codepoints.cps.data(), codepoints.cps.size(), (unsigned int)range.getOffset(), (int)range.getSize());

		// TODO: Expose APIs for direction and script?
		hb_buffer_guess_segment_properties(hbBuffers[0]);

		hb_shape(hbFonts[0], hbBuffers[0], nullptr, 0);

		bufferranges.push_back({0, (int) range.first, Range(0, hb_buffer_get_length(hbBuffers[0]))});
		return;
	}

	std::vector<Range> fallbackranges = { range };

	// For each font, figure out the ranges of valid glyphs in the given string,
	// and add the rest to a list to be shaped by the next fallback font.
	// Harfbuzz doesn't have its own fallback API.
	for (size_t rasti = 0; rasti < rasterizers.size(); rasti++)
	{
		hb_buffer_t *hbb = hbBuffers[rasti];
		hb_buffer_reset(hbb);

		for (Range r : fallbackranges)
			hb_buffer_add_codepoints(hbb, codepoints.cps.data(), codepoints.cps.size(), (unsigned int)r.getOffset(), (int)r.getSize());

		hb_buffer_guess_segment_properties(hbb);

		hb_shape(hbFonts[rasti], hbb, nullptr, 0);

		size_t glyphcount = (size_t)hb_buffer_get_length(hbb);
		const hb_glyph_info_t *glyphinfos = hb_buffer_get_glyph_infos(hbb, nullptr);
		hb_direction_t direction = hb_buffer_get_direction(hbb);

		fallbackranges.clear();

		for (size_t i = 0; i < glyphcount; i++)
		{
			if (isValidGlyph(glyphinfos[i].codepoint, codepoints.cps, glyphinfos[i].cluster))
			{
				if (bufferranges.empty() || bufferranges.back().index != rasti || bufferranges.back().range.getMax() + 1 != i)
					bufferranges.push_back({rasti, (int)glyphinfos[i].cluster, Range(i, 1)});
				else
					bufferranges.back().range.last++;
			}
			else if (rasti == rasterizers.size() - 1)
			{
				// Use the first font for remaining invalid glyphs when no
				// fallback font supports them.
				if (bufferranges.empty() || bufferranges.back().index != 0 || bufferranges.back().range.getMax() + 1 != i)
					bufferranges.push_back({0, (int)glyphinfos[i].cluster, Range(i, 1)});
				else
					bufferranges.back().range.last++;
			}
			else
			{
				// Harfbuzz puts RTL text into the buffer in reverse order, so
				// it'll start with the last cluster (character index).
				if (fallbackranges.empty() || (direction == HB_DIRECTION_RTL ? fallbackranges.back().getMin() : fallbackranges.back().getMax()) != glyphinfos[i - 1].cluster)
					fallbackranges.push_back(Range(glyphinfos[i].cluster, 1));
				else
					fallbackranges.back().encapsulate(glyphinfos[i].cluster);
			}
		}
	}

	std::sort(bufferranges.begin(), bufferranges.end(), [](const BufferRange &a, const BufferRange &b)
	{
		if (a.codepointStart != b.codepointStart)
			return a.codepointStart < b.codepointStart;
		if (a.index != b.index)
			return a.index < b.index;
		return a.range.first < b.range.first;
	});
}

Here, Range(i,1) is weird, although I don't know what the true start of range should be.

iacore avatar Jun 27 '25 23:06 iacore

Well, there's no question that this code snippet is where the problem lies, but I think the issue isn't exclusively with the range there. Given what we know it seems to stem from the fact that it's iterating over the fonts one by one, which is fine for establishing what codepoints are valid in each, but evidently not so much for pushing text to be rendered to the buffer.

Phoenix-Flare avatar Jun 28 '25 12:06 Phoenix-Flare

Thread 1 "love" received signal SIGTRAP, Trace/breakpoint trap.
love::font::freetype::HarfbuzzShaper::computeBufferRanges (this=0x555557b72190, codepoints=..., 
    range=..., bufferranges=std::vector of length 4, capacity 4 = {...})
    at /home/user/computing/runtime/love/src/modules/font/freetype/HarfbuzzShaper.cpp:199
199		std::sort(bufferranges.begin(), bufferranges.end(), [](const BufferRange &a, const BufferRange &b)
(gdb) p bufferranges
$1 = std::vector of length 4, capacity 4 = {{index = 0, codepointStart = 0, range = {first = 0, 
      last = 0}}, {index = 0, codepointStart = 2, range = {first = 2, last = 2}}, {index = 0, 
    codepointStart = 4, range = {first = 4, last = 4}}, {index = 1, codepointStart = 1, range = {
      first = 0, last = 1}}}
...
(gdb) p bufferranges
$2 = std::vector of length 5, capacity 8 = {{index = 0, codepointStart = 0, range = {first = 0, 
      last = 2}}, {index = 0, codepointStart = 4, range = {first = 4, last = 8}}, {index = 0, 
    codepointStart = 10, range = {first = 10, last = 14}}, {index = 0, codepointStart = 16, range = {
      first = 16, last = 18}}, {index = 1, codepointStart = 3, range = {first = 0, last = 2}}}
...
(gdb) p bufferranges
$3 = std::vector of length 3, capacity 4 = {{index = 0, codepointStart = 0, range = {first = 0, 
      last = 0}}, {index = 0, codepointStart = 2, range = {first = 2, last = 2}}, {index = 1, 
    codepointStart = 1, range = {first = 0, last = 3}}}
(gdb) c
Continuing.

This correspond to the three lines.

Image

This is what it should look like:

Image

Note that all "tofu" is grouped into one cluster, which is weird. (where .first!=.last)

iacore avatar Jun 29 '25 13:06 iacore