pygame-ce icon indicating copy to clipboard operation
pygame-ce copied to clipboard

Fix segfault with antialiased draw functions with a depth different than 32bits

Open bilhox opened this issue 1 year ago • 7 comments
trafficstars

Closes #586

This segfault also happens with antialiased functions from pygame.draw.

Needs some tests from other users

bilhox avatar Jul 19 '24 18:07 bilhox

edit : It doesn't fix gfxdraw side right now as it's meant to be deprecated and be redirected to pygame.draw for all antialiasing functions. This is why I consider it fixes the issue for gfxdraw, see #3005 for more informations.

bilhox avatar Jul 19 '24 21:07 bilhox

This whole thing is a mess.

Firstly, I would expect in a PR like this to see an explanation of what is actually being fixed, why it is segfaulting. Not just that it's fixed and needs tests from other users.

You want to use SDL_GetRGBA correctly? See https://github.com/libsdl-org/SDL/issues/8320 and https://github.com/libsdl-org/SDL/commit/f8dfee01bb4fdca3d7f77d4d2a30ed58c63e869c#diff-ec00a797fe34913f3e553761506783ef6b4cd0050704665615c93392ac7360e6R1557-R1618

They mention the performance is terrible "meant only for unittests", so that gets me thinking about performance:

So get_antialiased_color is a performance sensitive function, why is it being called over and over decoding the value of original_color every single time?

And so then this function produces a result to use in set_at, but set_at also calls SDL_GetRGB, but only on 24 bit! So in set_at it's taking a mapped color, unmapping it with SDL_GetRGB, then manually remapping it with bit shifting. Am I reading that right? (EDIT: PR to fix set_at to not call SDL_GetRGB every pixel: #3021)

P.S. maybe we can do our version of SDL_GetRGBA "manually inlined" into our source by grabbing it from the SDL source.

Starbuck5 avatar Jul 26 '24 05:07 Starbuck5

This whole thing is a mess.

Firstly, I would expect in a PR like this to see an explanation of what is actually being fixed, why it is segfaulting. Not just that it's fixed and needs tests from other users.

You want to use SDL_GetRGBA correctly? See libsdl-org/SDL#8320 and libsdl-org/SDL@f8dfee0#diff-ec00a797fe34913f3e553761506783ef6b4cd0050704665615c93392ac7360e6R1557-R1618

They mention the performance is terrible "meant only for unittests", so that gets me thinking about performance:

So get_antialiased_color is a performance sensitive function, why is it being called over and over decoding the value of original_color every single time?

And so then this function produces a result to use in set_at, but set_at also calls SDL_GetRGB, but only on 24 bit! So in set_at it's taking a mapped color, unmapping it with SDL_GetRGB, then manually remapping it with bit shifting. Am I reading that right? (EDIT: PR to fix set_at to not call SDL_GetRGB every pixel: #3021)

P.S. maybe we can do our version of SDL_GetRGBA "manually inlined" into our source by grabbing it from the SDL source.

Hello, first, thank you for the constructive review !

Sorry, I forgot to mention what was the culprit behind this segfault. As you guessed, the problem resided on how we were extracting pixels in get_antialiased_color in draw.c (extracted as a Uint32 array), leading to read at pixels out of the range of the array.

You suggested me to look at how SDL does the same work with SDL_ReadSurfacePixel, are you specifically talking about 24bit path ?

bilhox avatar Jul 26 '24 18:07 bilhox

You suggested me to look at how SDL does the same work with SDL_ReadSurfacePixel, are you specifically talking about 24bit path ?

So I see you checked out the SDL implementation and put it in, nice! The big endian build is failing though since you have a discrepancy in variable names.

I found another way to get around this problem that may be better. When I looked through draw.c I saw that this is the only location that does any pixel reading, everything else is just doing pixel writing. But there are places in pygame-ce doing pixel reading, how are they doing it? I found in surface.h there's a macro GET_PIXEL that works for all bit depths, that could be a good solution here. Might even be faster, as it doesn't need a memcpy-- although the memcpy might be compiler-optimized-away.

Anyways, I'd be interested in seeing you check it out.

Starbuck5 avatar Jul 27 '24 06:07 Starbuck5

I checked how it's done on other files, and I saw that they all used a switch case, and manually shift bits for 3 bytes case, and ultimately call GET_PIXELVAL which is a hidden SDL_GetRGBA.

bilhox avatar Jul 27 '24 10:07 bilhox

I checked how it's done on other files, and I saw that they all used a switch case, and manually shift bits for 3 bytes case, and ultimately call GET_PIXELVAL which is a hidden SDL_GetRGBA.

Yes, I wasn't saying it avoided calling that function? Just saying that we already have an implementation strategy for calling that function properly, maybe that's the way to go here as well.

Starbuck5 avatar Aug 04 '24 06:08 Starbuck5

What about adding a new C API function called pg_getPixelValue ? Everywhere they have their own macros.

bilhox avatar Aug 06 '24 07:08 bilhox

To move this fix as close as possible to being merged. I'm doing some performance tests (just a little skeptical about the perf behind a lot of memcpy calls) which I'll be posting here tomorrow. This will determine if we keep the switch case impl or the memcpy impl.

bilhox avatar Aug 23 '24 21:08 bilhox

My tests on aaline showed me this : graph_get_pixel_value

red is the switch case implementation and green is the memcpy implementation. As you can see switch case implementation is faster than memcpy. So I'll be reverting to switch case. (The results for other bpp show the same difference of performance)

bilhox avatar Aug 24 '24 09:08 bilhox