dmd icon indicating copy to clipboard operation
dmd copied to clipboard

fix #21435 Bad codegen when building dmd with optimizations

Open rainers opened this issue 7 months ago • 3 comments

don't handle postinc/dec of *p++ again in cdeq(), it is already done in cdpost() and only modifies the temporary copy, sometimes even a different register.

rainers avatar Jun 08 '25 18:06 rainers

Thanks for your pull request, @rainers!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

⚠️⚠️⚠️ Warnings ⚠️⚠️⚠️

  • In preparation for migrating from Bugzilla to GitHub Issues, the issue reference syntax has changed. Please add the word "Bugzilla" to issue references. For example, Fix Bugzilla Issue 12345 or Fix Bugzilla 12345.(Reminder: the edit needs to be done in the Git commit message, not the GitHub pull request.)

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "stable + dmd#21436"

dlang-bot avatar Jun 08 '25 18:06 dlang-bot

It doesn't look like the test is being built with optimizations?

Herringway avatar Jun 08 '25 18:06 Herringway

It doesn't look like the test is being built with optimizations?

Tests should be built with all combinations of -g -O -inline -release, all but -g seem to be necessary to trigger the bad register usage.

rainers avatar Jun 08 '25 19:06 rainers

ping @WalterBright

thewilsonator avatar Jun 23 '25 00:06 thewilsonator

I'll take a look at it shortly.

WalterBright avatar Jun 24 '25 17:06 WalterBright

Link to the issue: https://github.com/dlang/dmd/issues/21435

WalterBright avatar Jun 25 '25 01:06 WalterBright

@rainers The problem is definitely with the encoding of the SIB byte. There's a special case (!) for it when the reg field is 4 or 5. Just disabling the code generation is an incorrect fix, I will figure out the correct one. Thanks for helping me find this!

WalterBright avatar Jun 25 '25 03:06 WalterBright

"For the special-case addressing encodings R/M=100 (to force a SIB byte), MOD=00 R/M=101 (substitute RIP+disp32), and MOD=00 R/M=100 BASE=101 (substitute disp32), the REX.B prefix is not considered.[2]: §1.8.2  These special-case encodings apply to registers R12 (binary 1100) and R13 (binary 1101) as well,[4] and the same slightly-longer encodings must be used. This is because these exceptions change the encoded instruction size, so must be decoded very quickly so that the following instruction may be located."

https://en.wikipedia.org/wiki/ModR/M#64-bit_changes

WalterBright avatar Jun 25 '25 03:06 WalterBright

@rainers The problem is definitely with the encoding of the SIB byte. There's a special case (!) for it when the reg field is 4 or 5. Just disabling the code generation is an incorrect fix, I will figure out the correct one. Thanks for helping me find this!

@WalterBright Thanks for looking into this. While the SIB encoding might need some fix, too, the generated post increment seems gratuitous as it is done on a temporary. In the example above, the register that is used later on is rcx, not rbx.

rainers avatar Jun 25 '25 06:06 rainers

The p is used in the next iteration of the loop, so the post increment is not spurious.

WalterBright avatar Jun 25 '25 23:06 WalterBright

The p is used in the next iteration of the loop, so the post increment is not spurious.

p is kept in register rcx which is already incremented 2 instructions above.

rainers avatar Jun 26 '25 05:06 rainers

What is the status of this? its it good to go?

thewilsonator avatar Jul 05 '25 10:07 thewilsonator

I think it's superseded by https://github.com/dlang/dmd/pull/21480

dkorpel avatar Jul 05 '25 10:07 dkorpel

Yeah, after #21480 is merged this is a micro optimization not really important. There are many more unnecessary instructions in dmd's code gen (why use another register to begin with?).

rainers avatar Jul 06 '25 09:07 rainers