zapret
Please help fix the GCC bug
I noticed this commit: https://github.com/bol-van/zapret/commit/9402cd2cf0a16352a3401f7bce1a894bf131138b
This looks like an important GCC bug. I have filed a bug report with GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105884
Please provide more details, so that we can fix this gcc bug.
Despite Jonathan Wakely's suggestion, I decided not to use his memcpy solution. I want the compiler to do what I write, not rely on version- and architecture-specific optimizer logic. I analyzed the code produced on x86_64, arm64, mips64, mips32, and arm32. It is quite good on 64-bit architectures and quite bad on 32-bit ones. On some 32-bit architectures the compiler even emits direct calls to memcpy. That's bad.
Instead I added this: __attribute__((optimize("no-strict-aliasing"))) to avoid possible miscompiles.
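For readers following along, here is a minimal sketch of the two approaches being compared. It is not the code from the commit: the helper names `copy16_memcpy` and `copy16_punned` and the 16-byte size are assumptions made for illustration. The attribute is GCC-specific, and it only relaxes aliasing assumptions; it does not make misaligned access safe, so the sketch asserts alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from zapret): copy 16 bytes with memcpy.
   Well-defined under the aliasing rules; the generated code depends
   on the compiler version and target. */
static void copy16_memcpy(void *dst, const void *src)
{
    memcpy(dst, src, 16);
}

/* Hypothetical helper: the same copy written as two 64-bit loads and
   stores through cast pointers. The attribute asks GCC to compile this
   one function as if -fno-strict-aliasing were given. */
__attribute__((optimize("no-strict-aliasing")))
static void copy16_punned(void *dst, const void *src)
{
    /* Assumption: both buffers are aligned for uint64_t. */
    assert((uintptr_t)dst % _Alignof(uint64_t) == 0);
    assert((uintptr_t)src % _Alignof(uint64_t) == 0);

    uint64_t *d = (uint64_t *)dst;
    const uint64_t *s = (const uint64_t *)src;
    d[0] = s[0];
    d[1] = s[1];
}
```

Both helpers should produce the same bytes in `dst`; the difference is only in what the compiler is allowed to assume and what it emits. Note that the GCC documentation describes the `optimize` attribute as intended for debugging rather than production use, so building the whole file with -fno-strict-aliasing may be worth considering as an alternative.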
I also added use of the __int128 type where available. It helps improve the code on some architectures (arm64), although I could not get the compiler to use xmm registers on x86_64 this way. Only GCC 12 produces xmm code in the memcpy version; GCC 11 does not use xmm even in the memcpy version.
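A rough sketch of what "use __int128 if available" could look like, assuming detection via GCC's predefined `__SIZEOF_INT128__` macro. The names `word128_t`, `HAVE_INT128`, and `copy16_wide` are hypothetical and not taken from the commit; the actual code may detect and use the type differently.

```c
#include <assert.h>
#include <stdint.h>

/* GCC and Clang define __SIZEOF_INT128__ on targets that support __int128. */
#ifdef __SIZEOF_INT128__
typedef unsigned __int128 word128_t;   /* hypothetical typedef */
#define HAVE_INT128 1
#else
#define HAVE_INT128 0
#endif

/* Hypothetical helper: copy 16 bytes as one 128-bit value when the type
   exists, otherwise fall back to two 64-bit words. */
__attribute__((optimize("no-strict-aliasing")))
static void copy16_wide(void *dst, const void *src)
{
#if HAVE_INT128
    /* Assumption: both buffers are aligned for the 128-bit type. */
    assert((uintptr_t)dst % _Alignof(word128_t) == 0);
    assert((uintptr_t)src % _Alignof(word128_t) == 0);
    *(word128_t *)dst = *(const word128_t *)src;
#else
    /* Assumption: both buffers are aligned for uint64_t. */
    assert((uintptr_t)dst % _Alignof(uint64_t) == 0);
    assert((uintptr_t)src % _Alignof(uint64_t) == 0);
    ((uint64_t *)dst)[0] = ((const uint64_t *)src)[0];
    ((uint64_t *)dst)[1] = ((const uint64_t *)src)[1];
#endif
}
```

Whether the 128-bit path turns into a single wide load/store pair or vector instructions is up to the compiler and target, which matches the observation above that results differ between arm64 and x86_64.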