stb icon indicating copy to clipboard operation
stb copied to clipboard

[stb_truetype, MacOs] Heap buffer overflow in font Times.ttc stbtt_GetGlyphShape(glyph_index=38)

Open Jony01 opened this issue 3 years ago • 9 comments

Describe the bug I got "Heap buffer overflow" whenI access to glyph shape by function stbtt_GetGlyphShape glyph_index = 38 (code point 67, It should be latin capital letter 'C') on MacOs.

Already tested by myself When I do the same (with copied font from MacOs) on the Window everything works fine. Issue is somehow connected with Operating System.

To Reproduce

  1. Use font Times.ttc on MacOs (choose from Os //System/Library/Fonts/Times.ttc or use attached ".zip" file after extracting).
  2. Open font file by stbtt library
  3. Use function stbtt_GetGlyphShape with glyph_index 38 (not only this one)
  4. See error> "Heap buffer overflow"
// Minimal test to reproduce
TEST_CASE("Times MacOs", "") {
    std::string font_path = "//System/Library/Fonts/Times.ttc";
    char  letter   = 'C';
    float flatness = 2.;
    FILE *file = fopen(font_path.c_str(), "rb");
    REQUIRE(file != nullptr);
    // find size of file
    REQUIRE(fseek(file, 0L, SEEK_END) == 0);
    size_t size = ftell(file);
    REQUIRE(size != 0);
    rewind(file);
    std::vector<unsigned char> buffer(size);
    size_t count_loaded_bytes = fread((void *) &buffer.front(), 1, size, file);
    REQUIRE(count_loaded_bytes == size);
    int font_offset = stbtt_GetFontOffsetForIndex(buffer.data(), 0);
    REQUIRE(font_offset >= 0);
    stbtt_fontinfo font_info;
    REQUIRE(stbtt_InitFont(&font_info, buffer.data(), font_offset) != 0);    
    int unicode_letter = (int) letter;
    int glyph_index = stbtt_FindGlyphIndex(&font_info, unicode_letter);
    REQUIRE(glyph_index != 0);
    stbtt_vertex *vertices;
    int num_verts = stbtt_GetGlyphShape(&font_info, glyph_index, &vertices);
    CHECK(num_verts > 0);
}

Expected behavior Earn glyph shape.

Attachment Times.zip unnamed (5)

Jony01 avatar Feb 23 '22 08:02 Jony01

Code for allocation vertices(few lines before printScreen):

m = n + 2*numberOfContours;  // a loose bound on how many vertices we might need
vertices = (stbtt_vertex *) STBTT_malloc(m * sizeof(vertices[0]), info->userdata);

In this case: m = 41 off + i +1 = 41 In condition is touching not allocated memory

I create quick fix of it by adding +1 to allocation size. vertices = (stbtt_vertex *) STBTT_malloc((m+1) * sizeof(vertices[0]), info->userdata);

It seems to work for me. Could you confirm FIX?

Jony01 avatar Feb 23 '22 13:02 Jony01

at least it makes sense.

an adding 1 is a good demonstration that the issue is the allocation is too small (and it probably works on windows because the windows allocator is rounding up and giving us extra space), but a proper fix needs to figure out why the m calculation is not a conservative bound and figure out what the correct conservative bound is, so we don't just have a case in 1.5 years where somebody has to bump it to +2.

(Of course maybe the correct conservative bound is m+1, but we have to actually figure it out and be sure.)

nothings avatar Feb 23 '22 14:02 nothings

FYI: On windows is situation with allocation size and accessing unallocated memory the same. But windows compiler is not soo strict.

Jony01 avatar Feb 23 '22 16:02 Jony01

Is there somebody, who investigate the issue? Or should I go deeper to find a correct fix (without memory leaks).

Jony01 avatar Mar 07 '22 15:03 Jony01

@nothings : I agree with you, that is necessary to

figure out what the correct conservative bound is

@nothings are you still investigating this issue?

Or is there somebody, who could care about conservative bound?

Jony01 avatar Mar 18 '22 08:03 Jony01

Or is there somebody, who could care about conservative bound?

I haven't looked too deeply into this, but my 2c on this is that the allocation size itself might not be the problem here to begin with, and simply adding +1 to it just masks the problem.

The OOB access happens here, Line 1777:

https://github.com/nothings/stb/blob/af1a5bc352164740c1cc1354942b1c6b72eacb8a/stb_truetype.h#L1772-L1790

which is inside this loop, notice the i < n condition:

https://github.com/nothings/stb/blob/af1a5bc352164740c1cc1354942b1c6b72eacb8a/stb_truetype.h#L1761

So once again, without looking too deep into what the code is actually doing, it very well might be that the proper fix is to make sure i < n-1 before entering the if (start_off) branch since simply allocating +1 "fixes" the OOB access, but now (potentially) creates a new problem of reading from uninitialized data.

N-R-K avatar Mar 18 '22 09:03 N-R-K

You suggest change line 1771 from: https://github.com/nothings/stb/blob/af1a5bc352164740c1cc1354942b1c6b72eacb8a/stb_truetype.h#L1770-L1772 to: start_off = !(flags & 1) && (i+1) < n;

?

I dumped vertices data from problematic letter 'C' inside of font Times.ttf Original - Before change

v(x=986,        y=1346,         cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☺,         padding=═)
v(x=1117,       y=1313,         cx=1103,        cy=1313,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1172,       y=1329,         cx=1146,        cy=1313,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1208,       y=1379,         cx=1198,        cy=1345,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1251,       y=1379,         cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1270,       y=919,          cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1223,       y=919,          cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1116,       y=1140,         cx=1182,        cy=1056,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=777,        y=1298,         cx=990,         cy=1298,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=425,        y=1128,         cx=562,         cy=1298,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=289,        y=660,          cx=289,         cy=958,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=432,        y=222,          cx=289,         cy=386,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=796,        y=59,           cx=576,         cy=59,          cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1089,       y=135,          cx=955,         cy=59,          cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1258,       y=265,          cx=1166,        cy=178,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1296,       y=227,          cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1127,       y=76,           cx=1228,        cy=141,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=754,        y=-34,          cx=955,         cy=-34,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=230,        y=177,          cx=424,         cy=-34,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=51,         y=664,          cx=51,          cy=372,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=240,        y=1166,         cx=51,          cy=963,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=758,        y=1379,         cx=439,         cy=1379,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=986,        y=1346,         cx=869,         cy=1379,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=-515,       y=-515,         cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☺,         padding=═)
v(x=-515,       y=-515,         cx=751,         cy=1379,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)

After change

v(x=986,        y=1346,         cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☺,         padding=═)
v(x=1117,       y=1313,         cx=1103,        cy=1313,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1172,       y=1329,         cx=1146,        cy=1313,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1208,       y=1379,         cx=1198,        cy=1345,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1251,       y=1379,         cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1270,       y=919,          cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1223,       y=919,          cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1116,       y=1140,         cx=1182,        cy=1056,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=777,        y=1298,         cx=990,         cy=1298,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=425,        y=1128,         cx=562,         cy=1298,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=289,        y=660,          cx=289,         cy=958,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=432,        y=222,          cx=289,         cy=386,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=796,        y=59,           cx=576,         cy=59,          cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1089,       y=135,          cx=955,         cy=59,          cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1258,       y=265,          cx=1166,        cy=178,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=1296,       y=227,          cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)
v(x=1127,       y=76,           cx=1228,        cy=141,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=754,        y=-34,          cx=955,         cy=-34,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=230,        y=177,          cx=424,         cy=-34,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=51,         y=664,          cx=51,          cy=372,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=240,        y=1166,         cx=51,          cy=963,         cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=758,        y=1379,         cx=439,         cy=1379,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=986,        y=1346,         cx=869,         cy=1379,        cx1=-12851,     cy1=-12851,     type=♥,         padding=═)
v(x=751,        y=1379,         cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☺,         padding=═)
v(x=751,        y=1379,         cx=0,           cy=0,           cx1=-12851,     cy1=-12851,     type=☻,         padding=═)

So difference is only in last two point. Coordinates x , y , cx and cy . But I can't say that before change it was correct.

I got data to dump by function stbtt_GetGlyphShape as described in issue reproduction part. and print function:

    for (int i = 0; i < num_verts; i++) {
        const auto& v = vertices[i];
        std::cout << "v("
                  << "x=" << v.x << ",   \t"
                  << "y=" << v.y << ",   \t"
                  << "cx=" << v.cx << ",   \t"
                  << "cy=" << v.cy << ",   \t"
                  << "cx1=" << v.cx1 << ",   \t"
                  << "cy1=" << v.cy1 << ",   \t"
                  << "type=" << v.type << ",   \t"
                  << "padding=" << v.padding << ") " << std::endl;
    }

Jony01 avatar Mar 18 '22 11:03 Jony01

I was just pointing one some suspects, wasn't a suggestion. From my point of view, aka someone who doesn't know what the code is doing or what the "correct" output should be; one other suspect might be flags getting incorrectly set.

Once again, I'm not sure where the error actually is, I'm simply pointing out some suspects. Someone who actually understands the code should probably take a look into this.

N-R-K avatar Mar 18 '22 11:03 N-R-K

Ok, this was barking up the wrong tree because I didn't look at the code. The code is iterating over the array i=[0..n), with off chosen so that 'off+i' is always in [0..m). The problem is that it's looking ahead one point, at 'off+i+1' (you can see this in the screenshot). Fonts do this all the time, but it causes a problem if it's done when i+1=n. And most fonts don't ever do this, because it makes no sense. But random fonts sometimes have random dumb crap in them that have no effect and need to be handled correctly.

To spell out what this font does: conceptually, font shapes are defined as a sequence of 'moveto', 'lineto', and 'curveto' operations. But actually in truetype you get a series of points, each flagged as on the curve or off the curve, where curve really means shape. Conceptually, if two in a row are on the curve, they are a line segment; if there's an intervening off-curve point, then it's a quadratic bezier with the off-curve as a control point. (There's another case for if two in a row are off the curve, which does not produce a cubic bezier, but the details are irrelevant here.)

Rather than being flagged as 'moveto', the index of the moveto operations are stored in a separate array. Because move operations are indexed separately, they are otherwise normal points, so they still get a flag for being on-curve or off-curve. If the move operation (known in the truetype specification as the first point in a contour) is off-curve, we look ahead one point to see how to handle the off-curve point. We're looking at point off+i, so this makes us fetch off+i+1. So, if the LAST point in the list of all points (i=n-1) is listed in the 'move' array, and that point is tagged as off-curve, then we will try to access off+i+1 which is off+n which is m. This causes us to access array element m, which is one past the end of the array.

All of this is entirely pointless; since there are no points after this we will never actually produce a real shape, and this vertex will eventually be discarded. So it is probably completely functional to simply allocate the array to m+1 and read garbage (that later gets discarded), but it is probably better to guard the whole section, replacing if (start_off) with if (start_off && (i+1)<n). None of this code existed in the original implementation, which didn't handle off-curve moves at all, so clearly when it was added, this wasn't thought through.

This will be fixed properly in the next version, but you can work around it either of the above ways while waiting.

nothings avatar Mar 18 '22 11:03 nothings