Simplify endian conversion functions in std.bitmanip.
They're overly complex (e.g. testing for the native endianness of the machine when that's actually completely unnecessary), and they use unions in a manner that is undefined behavior in C/C++ (since they write to one field and then read from another). I don't see any mention of whether that behavior is defined in D in D's spec, but it's probably undefined in D given that it's undefined in C/C++. Either way, it's an overly complex solution.
This solution gets rid of all of the endian version checks in std.bitmanip, and it allows the implementations of the endian conversion functions to be the same between CTFE and runtime.
@atilaneves Assuming that I didn't screw up the implementation somehow, this should be much more to your liking than what's been here. Either way, it results in a lot less code to be screwed up. :)
Thanks for your pull request, @jmdavis!
Bugzilla references
Your PR doesn't reference any Bugzilla issue.
If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:
dub run digger -- build "master + phobos#8822"
If I screwed up anything, it would probably be the floating point stuff, but it passes the tests (at least on my machine).
I find it somewhat suspicious that I had to change @safe to @trusted on the endian conversion functions so that the compilation for vibe-d in buildkite would stop failing due to complaining due to taking the address of a local variable given that the same instantiation succeeded on my machine, but whatever. At one point time, that definitely would have been @system, so it's probably some difference in compilation flags related to DIP 1000.
I dislike that this PR makes DMD's output much worse without giving any user-visible benefit elsewhere. Prior to this PR DMD does the same thing as LDC.
assume CS:.text._D3std8bitmanip__T20nativeToLittleEndianTmZQzFNaNbNiNfxmZG8h
push RBP
mov RBP,RSP
mov RAX,RDI
pop RBP
ret
assume CS:.text._D3std8bitmanip__T17nativeToBigEndianTmZQwFNaNbNiNfxmZG8h
push RBP
mov RBP,RSP
mov RAX,RDI
bswap EAX
pop RBP
ret
Post this PR it does not.
assume CS:.text._D3std8bitmanip__T20nativeToLittleEndianTmZQzFNaNbNiNexmZG8h
push RBP
mov RBP,RSP
sub RSP,020h
mov -018h[RBP],RBX
mov -8[RBP],RDI
mov qword ptr -010h[RBP],0
mov -010h[RBP],DIL
shr RDI,8
mov -0Fh[RBP],DIL
mov RAX,-8[RBP]
shr RAX,010h
mov -0Eh[RBP],AL
mov RCX,-8[RBP]
shr RCX,018h
mov -0Dh[RBP],CL
mov DL,-4[RBP]
mov -0Ch[RBP],DL
mov RBX,-8[RBP]
shr RBX,028h
mov -0Bh[RBP],BL
mov RSI,-8[RBP]
shr RSI,030h
mov -0Ah[RBP],SIL
mov RAX,-8[RBP]
shr RAX,038h
mov -9[RBP],AL
mov RAX,-010h[RBP]
mov RBX,-018h[RBP]
mov RSP,RBP
pop RBP
ret
assume CS:.text._D3std8bitmanip__T17nativeToBigEndianTmZQwFNaNbNiNexmZG8h
push RBP
mov RBP,RSP
sub RSP,020h
mov -018h[RBP],RBX
mov -8[RBP],RDI
mov qword ptr -010h[RBP],0
shr RDI,038h
mov -010h[RBP],DIL
mov RAX,-8[RBP]
shr RAX,030h
mov -0Fh[RBP],AL
mov RCX,-8[RBP]
shr RCX,028h
mov -0Eh[RBP],CL
mov DL,-4[RBP]
mov -0Dh[RBP],DL
mov RBX,-8[RBP]
shr RBX,018h
mov -0Ch[RBP],BL
mov RSI,-8[RBP]
shr RSI,010h
mov -0Bh[RBP],SIL
mov RAX,-8[RBP]
shr RAX,8
mov -0Ah[RBP],AL
mov CL,-8[RBP]
mov -9[RBP],CL
mov RAX,-010h[RBP]
mov RBX,-018h[RBP]
mov RSP,RBP
pop RBP
ret
I'm only half-joking when I say mentioning this might increase the chance this PR is accepted, but so long as DMD is available for download on the front page of dlang.org I don't think that's the right attitude. This is going to harm anyone who is currently using DMD and std.bitmanip under the assumption—which until this moment has been valid—that it does something reasonable.
I dislike that this PR makes DMD's output much worse without giving any user-visible benefit elsewhere. Prior to this PR DMD does the same thing as LDC.
Well, the union needs to go, because while it happens to work, it's undefined behavior, and the compiler should be able to optimize bitshifts such that there is no performance change here. If I had to guess, the issue is that each bitshift is done individually (because that was easier to do generically) rather than as a single expression, though dmd's optimizer may just be bad at optimizing this particular operation.
I guess that I'll have to try it with a string mixin instead so that I can generate a single expression, much as the code is uglier that way.
Actually, the disassembly that you were showing was for the nativeTo*Endian functions, which need to assign to each byte individually (whereas the *EndianToNative functions could do the assignment from the bytes in one statement), so I don't think that that can be turned into a single statement with bitshifts, which would mean that the issue here is that dmd's optimizer isn't smart enough.
Type punning with a union is not undefined behavior in C. See this footnote from the C11 standard:
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ''type punning'').