fix #21435 Bad codegen when building dmd with optimizations
don't handle postinc/dec of *p++ again in cdeq(), it is already done in cdpost() and only modifies the temporary copy, sometimes even a different register.
Thanks for your pull request, @rainers!
Bugzilla references
Your PR doesn't reference any Bugzilla issue.
If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
⚠️⚠️⚠️ Warnings ⚠️⚠️⚠️
- In preparation for migrating from Bugzilla to GitHub Issues, the issue reference syntax has changed. Please add the word "Bugzilla" to issue references. For example,
Fix Bugzilla Issue 12345orFix Bugzilla 12345.(Reminder: the edit needs to be done in the Git commit message, not the GitHub pull request.)
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:
dub run digger -- build "stable + dmd#21436"
It doesn't look like the test is being built with optimizations?
It doesn't look like the test is being built with optimizations?
Tests should be built with all combinations of -g -O -inline -release, all but -g seem to be necessary to trigger the bad register usage.
ping @WalterBright
I'll take a look at it shortly.
Link to the issue: https://github.com/dlang/dmd/issues/21435
@rainers The problem is definitely with the encoding of the SIB byte. There's a special case (!) for it when the reg field is 4 or 5. Just disabling the code generation is an incorrect fix, I will figure out the correct one. Thanks for helping me find this!
"For the special-case addressing encodings R/M=100 (to force a SIB byte), MOD=00 R/M=101 (substitute RIP+disp32), and MOD=00 R/M=100 BASE=101 (substitute disp32), the REX.B prefix is not considered.[2]: §1.8.2 These special-case encodings apply to registers R12 (binary 1100) and R13 (binary 1101) as well,[4] and the same slightly-longer encodings must be used. This is because these exceptions change the encoded instruction size, so must be decoded very quickly so that the following instruction may be located."
https://en.wikipedia.org/wiki/ModR/M#64-bit_changes
@rainers The problem is definitely with the encoding of the SIB byte. There's a special case (!) for it when the reg field is 4 or 5. Just disabling the code generation is an incorrect fix, I will figure out the correct one. Thanks for helping me find this!
@WalterBright Thanks for looking into this. While the SIB encoding might need some fix, too, the generated post increment seems gratuitous as it is done on a temporary. In the example above, the register that is used later on is rcx, not rbx.
The p is used in the next iteration of the loop, so the post increment is not spurious.
The p is used in the next iteration of the loop, so the post increment is not spurious.
p is kept in register rcx which is already incremented 2 instructions above.
What is the status of this? its it good to go?
I think it's superseded by https://github.com/dlang/dmd/pull/21480
Yeah, after #21480 is merged this is a micro optimization not really important. There are many more unnecessary instructions in dmd's code gen (why use another register to begin with?).