stb
stb copied to clipboard
[stb_truetype, MacOs] Heap buffer overflow in font Times.ttc stbtt_GetGlyphShape(glyph_index=38)
Describe the bug
I got "Heap buffer overflow" whenI access to glyph shape by function stbtt_GetGlyphShape
glyph_index = 38 (code point 67, It should be latin capital letter 'C') on MacOs.
Already tested by myself When I do the same (with copied font from MacOs) on the Window everything works fine. Issue is somehow connected with Operating System.
To Reproduce
- Use font
Times.ttc
on MacOs (choose from Os//System/Library/Fonts/Times.ttc
or use attached ".zip" file after extracting). - Open font file by stbtt library
- Use function
stbtt_GetGlyphShape
with glyph_index 38 (not only this one) - See error> "Heap buffer overflow"
// Minimal test to reproduce
TEST_CASE("Times MacOs", "") {
std::string font_path = "//System/Library/Fonts/Times.ttc";
char letter = 'C';
float flatness = 2.;
FILE *file = fopen(font_path.c_str(), "rb");
REQUIRE(file != nullptr);
// find size of file
REQUIRE(fseek(file, 0L, SEEK_END) == 0);
size_t size = ftell(file);
REQUIRE(size != 0);
rewind(file);
std::vector<unsigned char> buffer(size);
size_t count_loaded_bytes = fread((void *) &buffer.front(), 1, size, file);
REQUIRE(count_loaded_bytes == size);
int font_offset = stbtt_GetFontOffsetForIndex(buffer.data(), 0);
REQUIRE(font_offset >= 0);
stbtt_fontinfo font_info;
REQUIRE(stbtt_InitFont(&font_info, buffer.data(), font_offset) != 0);
int unicode_letter = (int) letter;
int glyph_index = stbtt_FindGlyphIndex(&font_info, unicode_letter);
REQUIRE(glyph_index != 0);
stbtt_vertex *vertices;
int num_verts = stbtt_GetGlyphShape(&font_info, glyph_index, &vertices);
CHECK(num_verts > 0);
}
Expected behavior Earn glyph shape.
Attachment
Times.zip
Code for allocation vertices(few lines before printScreen):
m = n + 2*numberOfContours; // a loose bound on how many vertices we might need
vertices = (stbtt_vertex *) STBTT_malloc(m * sizeof(vertices[0]), info->userdata);
In this case: m = 41 off + i +1 = 41 In condition is touching not allocated memory
I create quick fix of it by adding +1 to allocation size.
vertices = (stbtt_vertex *) STBTT_malloc((m+1) * sizeof(vertices[0]), info->userdata);
It seems to work for me. Could you confirm FIX?
at least it makes sense.
an adding 1 is a good demonstration that the issue is the allocation is too small (and it probably works on windows because the windows allocator is rounding up and giving us extra space), but a proper fix needs to figure out why the m
calculation is not a conservative bound and figure out what the correct conservative bound is, so we don't just have a case in 1.5 years where somebody has to bump it to +2.
(Of course maybe the correct conservative bound is m+1
, but we have to actually figure it out and be sure.)
FYI: On windows is situation with allocation size and accessing unallocated memory the same. But windows compiler is not soo strict.
Is there somebody, who investigate the issue? Or should I go deeper to find a correct fix (without memory leaks).
@nothings : I agree with you, that is necessary to
figure out what the correct conservative bound is
@nothings are you still investigating this issue?
Or is there somebody, who could care about conservative bound?
Or is there somebody, who could care about conservative bound?
I haven't looked too deeply into this, but my 2c on this is that the allocation size itself might not be the problem here to begin with, and simply adding +1
to it just masks the problem.
The OOB access happens here, Line 1777:
https://github.com/nothings/stb/blob/af1a5bc352164740c1cc1354942b1c6b72eacb8a/stb_truetype.h#L1772-L1790
which is inside this loop, notice the i < n
condition:
https://github.com/nothings/stb/blob/af1a5bc352164740c1cc1354942b1c6b72eacb8a/stb_truetype.h#L1761
So once again, without looking too deep into what the code is actually doing, it very well might be that the proper fix is to make sure i < n-1
before entering the if (start_off)
branch since simply allocating +1
"fixes" the OOB access, but now (potentially) creates a new problem of reading from uninitialized data.
You suggest change line 1771 from:
https://github.com/nothings/stb/blob/af1a5bc352164740c1cc1354942b1c6b72eacb8a/stb_truetype.h#L1770-L1772
to:
start_off = !(flags & 1) && (i+1) < n;
?
I dumped vertices data from problematic letter 'C' inside of font Times.ttf Original - Before change
v(x=986, y=1346, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☺, padding=═)
v(x=1117, y=1313, cx=1103, cy=1313, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1172, y=1329, cx=1146, cy=1313, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1208, y=1379, cx=1198, cy=1345, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1251, y=1379, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1270, y=919, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1223, y=919, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1116, y=1140, cx=1182, cy=1056, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=777, y=1298, cx=990, cy=1298, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=425, y=1128, cx=562, cy=1298, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=289, y=660, cx=289, cy=958, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=432, y=222, cx=289, cy=386, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=796, y=59, cx=576, cy=59, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1089, y=135, cx=955, cy=59, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1258, y=265, cx=1166, cy=178, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1296, y=227, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1127, y=76, cx=1228, cy=141, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=754, y=-34, cx=955, cy=-34, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=230, y=177, cx=424, cy=-34, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=51, y=664, cx=51, cy=372, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=240, y=1166, cx=51, cy=963, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=758, y=1379, cx=439, cy=1379, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=986, y=1346, cx=869, cy=1379, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=-515, y=-515, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☺, padding=═)
v(x=-515, y=-515, cx=751, cy=1379, cx1=-12851, cy1=-12851, type=♥, padding=═)
After change
v(x=986, y=1346, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☺, padding=═)
v(x=1117, y=1313, cx=1103, cy=1313, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1172, y=1329, cx=1146, cy=1313, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1208, y=1379, cx=1198, cy=1345, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1251, y=1379, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1270, y=919, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1223, y=919, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1116, y=1140, cx=1182, cy=1056, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=777, y=1298, cx=990, cy=1298, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=425, y=1128, cx=562, cy=1298, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=289, y=660, cx=289, cy=958, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=432, y=222, cx=289, cy=386, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=796, y=59, cx=576, cy=59, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1089, y=135, cx=955, cy=59, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1258, y=265, cx=1166, cy=178, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=1296, y=227, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
v(x=1127, y=76, cx=1228, cy=141, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=754, y=-34, cx=955, cy=-34, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=230, y=177, cx=424, cy=-34, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=51, y=664, cx=51, cy=372, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=240, y=1166, cx=51, cy=963, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=758, y=1379, cx=439, cy=1379, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=986, y=1346, cx=869, cy=1379, cx1=-12851, cy1=-12851, type=♥, padding=═)
v(x=751, y=1379, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☺, padding=═)
v(x=751, y=1379, cx=0, cy=0, cx1=-12851, cy1=-12851, type=☻, padding=═)
So difference is only in last two point. Coordinates x
, y
, cx
and cy
. But I can't say that before change it was correct.
I got data to dump by function stbtt_GetGlyphShape
as described in issue reproduction part.
and print function:
for (int i = 0; i < num_verts; i++) {
const auto& v = vertices[i];
std::cout << "v("
<< "x=" << v.x << ", \t"
<< "y=" << v.y << ", \t"
<< "cx=" << v.cx << ", \t"
<< "cy=" << v.cy << ", \t"
<< "cx1=" << v.cx1 << ", \t"
<< "cy1=" << v.cy1 << ", \t"
<< "type=" << v.type << ", \t"
<< "padding=" << v.padding << ") " << std::endl;
}
I was just pointing one some suspects, wasn't a suggestion. From my point of view, aka someone who doesn't know what the code is doing or what the "correct" output should be; one other suspect might be flags
getting incorrectly set.
Once again, I'm not sure where the error actually is, I'm simply pointing out some suspects. Someone who actually understands the code should probably take a look into this.
Ok, this was barking up the wrong tree because I didn't look at the code. The code is iterating over the array i
=[0..n), with off
chosen so that 'off+i' is always in [0..m). The problem is that it's looking ahead one point, at 'off+i+1' (you can see this in the screenshot). Fonts do this all the time, but it causes a problem if it's done when i+1
=n
. And most fonts don't ever do this, because it makes no sense. But random fonts sometimes have random dumb crap in them that have no effect and need to be handled correctly.
To spell out what this font does: conceptually, font shapes are defined as a sequence of 'moveto', 'lineto', and 'curveto' operations. But actually in truetype you get a series of points, each flagged as on the curve or off the curve, where curve really means shape. Conceptually, if two in a row are on the curve, they are a line segment; if there's an intervening off-curve point, then it's a quadratic bezier with the off-curve as a control point. (There's another case for if two in a row are off the curve, which does not produce a cubic bezier, but the details are irrelevant here.)
Rather than being flagged as 'moveto', the index of the moveto operations are stored in a separate array. Because move operations are indexed separately, they are otherwise normal points, so they still get a flag for being on-curve or off-curve. If the move operation (known in the truetype specification as the first point in a contour) is off-curve, we look ahead one point to see how to handle the off-curve point. We're looking at point off+i
, so this makes us fetch off+i+1
. So, if the LAST point in the list of all points (i
=n
-1) is listed in the 'move' array, and that point is tagged as off-curve, then we will try to access off+i+1
which is off+n
which is m
. This causes us to access array element m
, which is one past the end of the array.
All of this is entirely pointless; since there are no points after this we will never actually produce a real shape, and this vertex will eventually be discarded. So it is probably completely functional to simply allocate the array to m+1
and read garbage (that later gets discarded), but it is probably better to guard the whole section, replacing if (start_off)
with if (start_off && (i+1)<n)
. None of this code existed in the original implementation, which didn't handle off-curve moves at all, so clearly when it was added, this wasn't thought through.
This will be fixed properly in the next version, but you can work around it either of the above ways while waiting.