Alberto
I can only give a partial answer because I'm in a hurry, but thanks for the comment; I didn't test that. Using the same test I showed in the previous PR...
Random test failures aside (not caused by this PR), I've managed to improve performance by 15-20% by removing some unnecessary instructions, and now the situation is much better compared to...
Anyway, about this: > Also, tell me more about the changes. Did you do any algorithmic changes or is it just changes to increase the parallelism? The changes are mainly...
> Is this being worked on currently? If not, I'll give it a go. Is this to be written in C? Yes, this is being worked on and was targeted...
You have to add a bunch of explicit casts: 
> I don't really understand the proposal, but piping in on this: > > > Create a register cache (of __m256i) and load the Surface’s pixels once. > > It...
I think we should also drop returning rects/rect lists from functions like blit/blits, since that only existed as an optimization for display.update(), but as @MyreMylar found out,...
Possibly we could also make all alphablit algorithms no longer behave as SDL1 (which, to be fair, is kinda weird). This should be investigated further, but I suspect it could...
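To make the "behave as SDL1" point a bit more concrete, here is a rough sketch of the kind of difference I mean. It assumes the SDL1-style blend is the shift-based one that approximates the division by 255 with `>> 8`; the function names `blend_shift` and `blend_exact` are made up for this illustration and this is not copied from the actual source.

```c
#include <stdint.h>

/* Shift-based blend of one colour channel.
 * Assumption: this is the SDL1-era style formula, which replaces
 * the division by 255 with a right shift by 8. Hypothetical helper. */
uint8_t blend_shift(uint8_t src, uint8_t dst, uint8_t alpha)
{
    int diff = (int)src - (int)dst;
    /* Right-shifting a negative int is implementation-defined in C;
     * this sketch relies on the arithmetic shift common compilers use. */
    return (uint8_t)(((diff * (int)alpha + (int)src) >> 8) + (int)dst);
}

/* Reference blend of one colour channel: exact division by 255,
 * rounded to nearest. Hypothetical helper for comparison only. */
uint8_t blend_exact(uint8_t src, uint8_t dst, uint8_t alpha)
{
    int v = (int)src * (int)alpha + (int)dst * (255 - (int)alpha);
    return (uint8_t)((v + 127) / 255);
}
```

The two agree at the extremes (alpha 0 keeps the destination, alpha 255 gives the source), so any difference would only show up for intermediate alpha values, which is the part that would need investigating.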
Why not make these functions METH_FASTCALL? Instead of doing it in a separate PR, which would double the review time, I think we should implement it here.
I'll be reverting the implementation in rect.collidepoint as an example because of the changes that were made to the rect module.