[12.0] love.graphics.print(): misordered characters when using fallback fonts
Version: 12.0-development (8951635)
Repro: 12-fallbacks.zip
Problem: When drawing text with love.graphics.print() using a font that has been configured with fallbacks, glyphs originating from the fallback fonts may appear in the wrong place.
Expected appearance of text, using NotoSans with NotoSansSymbol2 as a fallback:
a⇨b⇨c
The⇨quick⇨brown⇨fox
a⏎a⏎⇨⏎
Observed:
a⇨⇨bc
The⇨⇨⇨quickbrownfox
a⏎⏎⇨⏎a
The expected order appears in LÖVE 11.5:
I've noticed something of a pattern with this bug: The string will render normally up to the point that the first character requiring a fallback font is encountered, at which point it will render every character requiring the fallback at once before continuing on with what's left of the string. Not sure how much this discovery will help, and I haven't had an opportunity to test it with multiple fallback fonts just yet, though it's consistent with the behavior seen above.
I found the problem.
void HarfbuzzShaper::computeBufferRanges(const ColoredCodepoints &codepoints, Range range, std::vector<BufferRange> &bufferranges)
{
bufferranges.clear();
if (codepoints.cps.size() == 0)
return;
// Less computation for the typical case (no fallback fonts).
if (rasterizers.size() == 1)
{
hb_buffer_reset(hbBuffers[0]);
hb_buffer_add_codepoints(hbBuffers[0], codepoints.cps.data(), codepoints.cps.size(), (unsigned int)range.getOffset(), (int)range.getSize());
// TODO: Expose APIs for direction and script?
hb_buffer_guess_segment_properties(hbBuffers[0]);
hb_shape(hbFonts[0], hbBuffers[0], nullptr, 0);
bufferranges.push_back({0, (int) range.first, Range(0, hb_buffer_get_length(hbBuffers[0]))});
return;
}
std::vector<Range> fallbackranges = { range };
// For each font, figure out the ranges of valid glyphs in the given string,
// and add the rest to a list to be shaped by the next fallback font.
// Harfbuzz doesn't have its own fallback API.
for (size_t rasti = 0; rasti < rasterizers.size(); rasti++)
{
hb_buffer_t *hbb = hbBuffers[rasti];
hb_buffer_reset(hbb);
for (Range r : fallbackranges)
hb_buffer_add_codepoints(hbb, codepoints.cps.data(), codepoints.cps.size(), (unsigned int)r.getOffset(), (int)r.getSize());
hb_buffer_guess_segment_properties(hbb);
hb_shape(hbFonts[rasti], hbb, nullptr, 0);
size_t glyphcount = (size_t)hb_buffer_get_length(hbb);
const hb_glyph_info_t *glyphinfos = hb_buffer_get_glyph_infos(hbb, nullptr);
hb_direction_t direction = hb_buffer_get_direction(hbb);
fallbackranges.clear();
for (size_t i = 0; i < glyphcount; i++)
{
if (isValidGlyph(glyphinfos[i].codepoint, codepoints.cps, glyphinfos[i].cluster))
{
if (bufferranges.empty() || bufferranges.back().index != rasti || bufferranges.back().range.getMax() + 1 != i)
bufferranges.push_back({rasti, (int)glyphinfos[i].cluster, Range(i, 1)});
else
bufferranges.back().range.last++;
}
else if (rasti == rasterizers.size() - 1)
{
// Use the first font for remaining invalid glyphs when no
// fallback font supports them.
if (bufferranges.empty() || bufferranges.back().index != 0 || bufferranges.back().range.getMax() + 1 != i)
bufferranges.push_back({0, (int)glyphinfos[i].cluster, Range(i, 1)});
else
bufferranges.back().range.last++;
}
else
{
// Harfbuzz puts RTL text into the buffer in reverse order, so
// it'll start with the last cluster (character index).
if (fallbackranges.empty() || (direction == HB_DIRECTION_RTL ? fallbackranges.back().getMin() : fallbackranges.back().getMax()) != glyphinfos[i - 1].cluster)
fallbackranges.push_back(Range(glyphinfos[i].cluster, 1));
else
fallbackranges.back().encapsulate(glyphinfos[i].cluster);
}
}
}
std::sort(bufferranges.begin(), bufferranges.end(), [](const BufferRange &a, const BufferRange &b)
{
if (a.codepointStart != b.codepointStart)
return a.codepointStart < b.codepointStart;
if (a.index != b.index)
return a.index < b.index;
return a.range.first < b.range.first;
});
}
Here, Range(i,1) is weird, although I don't know what the true start of range should be.
Well, there's no question that this code snippet is where the problem lies, but I think the issue isn't exclusively with the range there. Given what we know it seems to stem from the fact that it's iterating over the fonts one by one, which is fine for establishing what codepoints are valid in each, but evidently not so much for pushing text to be rendered to the buffer.
Thread 1 "love" received signal SIGTRAP, Trace/breakpoint trap.
love::font::freetype::HarfbuzzShaper::computeBufferRanges (this=0x555557b72190, codepoints=...,
range=..., bufferranges=std::vector of length 4, capacity 4 = {...})
at /home/user/computing/runtime/love/src/modules/font/freetype/HarfbuzzShaper.cpp:199
199 std::sort(bufferranges.begin(), bufferranges.end(), [](const BufferRange &a, const BufferRange &b)
(gdb) p bufferranges
$1 = std::vector of length 4, capacity 4 = {{index = 0, codepointStart = 0, range = {first = 0,
last = 0}}, {index = 0, codepointStart = 2, range = {first = 2, last = 2}}, {index = 0,
codepointStart = 4, range = {first = 4, last = 4}}, {index = 1, codepointStart = 1, range = {
first = 0, last = 1}}}
...
(gdb) p bufferranges
$2 = std::vector of length 5, capacity 8 = {{index = 0, codepointStart = 0, range = {first = 0,
last = 2}}, {index = 0, codepointStart = 4, range = {first = 4, last = 8}}, {index = 0,
codepointStart = 10, range = {first = 10, last = 14}}, {index = 0, codepointStart = 16, range = {
first = 16, last = 18}}, {index = 1, codepointStart = 3, range = {first = 0, last = 2}}}
...
(gdb) p bufferranges
$3 = std::vector of length 3, capacity 4 = {{index = 0, codepointStart = 0, range = {first = 0,
last = 0}}, {index = 0, codepointStart = 2, range = {first = 2, last = 2}}, {index = 1,
codepointStart = 1, range = {first = 0, last = 3}}}
(gdb) c
Continuing.
This correspond to the three lines.
This is what it should look like:
Note that all "tofu" is grouped into one cluster, which is weird. (where .first!=.last)