ldc2 with -g flag fails to link on Apple M1 Pro (aarch64) with unaligned pointer

Open schveiguy opened this issue 4 years ago • 50 comments

I removed everything I could think of:

% cat testit.d
extern(C) void main() {}
% ldc2 -betterC testit.d
% ./testit              
% ldc2 -betterC -g testit.d
ld: warning: pointer not aligned at address 0x10000402D (anon + 45 from testit.o)
ld: unaligned pointer(s) for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error: /usr/bin/cc failed with status: 1

schveiguy avatar Nov 04 '21 22:11 schveiguy

Some sleuthing reveals that -platform_version macos 12.0.0 12.0 is the argument passed to the linker that causes this to fail. On Big Sur (OS version 11), -platform_version macos 11.0.0 12.0 is passed instead.

This latter command works, even with the same object file. The first number of that argument is the min platform version.
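For anyone who wants to check which min platform version their own link step receives, here is a minimal sketch. It assumes you have already captured the linker command line as text (the error message suggests -v for that); the function name `platform_min_version` is made up for illustration.

```shell
# Print the first version number that follows "-platform_version <platform>"
# in a linker command line read from stdin. Per the comment above, that first
# number is the min platform version.
platform_min_version() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "-platform_version") {
                # $(i+1) is the platform name (e.g. macos), $(i+2) the min version
                print $(i + 2)
                exit
            }
        }
    }'
}

# Example with a hand-written command line (not real linker output):
echo "ld -arch arm64 -platform_version macos 12.0.0 12.0 -o testit testit.o" | platform_min_version
```

If nothing is printed, the captured text contains no -platform_version argument.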

thewilsonator avatar Nov 04 '21 23:11 thewilsonator

As a workaround, if I set the MACOSX_DEPLOYMENT_TARGET environment variable to 11 it will link successfully.
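A small sketch of applying this workaround to a single command instead of exporting the variable for the whole shell session. The wrapper name `with_deployment_target` is hypothetical; it only relies on the standard `env` utility.

```shell
# Run one command with MACOSX_DEPLOYMENT_TARGET set, leaving the calling
# shell's environment untouched.
with_deployment_target() {
    target="$1"
    shift
    env MACOSX_DEPLOYMENT_TARGET="$target" "$@"
}

# Usage (assumes ldc2 is on PATH and testit.d is the file from the report):
#   with_deployment_target 11 ldc2 -betterC -g testit.d
```

This keeps later commands in the same terminal on the default deployment target, which makes it easier to compare both behaviours side by side.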

jacob-carlborg avatar Nov 08 '21 16:11 jacob-carlborg

I guess ld64 pedantically checks DWARF alignment specs and complains about an invalid one; I guess alignment of pointer itself vs. pointee alignment or something along these lines.

kinke avatar Nov 08 '21 16:11 kinke

Thx for testing, Steven. 2nd guess: https://github.com/ldc-developers/ldc/blob/5a28329e2f027ff2e3ea77a299f245e5baa4247c/driver/targetmachine.cpp#L40-L50

(testable via -preserve-dwarf-line-section=false)

kinke avatar Nov 26 '21 01:11 kinke

2nd guess

That seems to have worked. However, I now do not have file/line numbers as I did when targeting MacOS 11.

steves@MacBook-Pro objdraw % export MACOSX_DEPLOYMENT_TARGET=11                                                                      
steves@MacBook-Pro objdraw % cat testthrow.d                   
void foo()
{
    throw new Exception("hi");
}

void main()
{
    foo();
}
steves@MacBook-Pro objdraw % ldc2 -g testthrow.d 
steves@MacBook-Pro objdraw % ./testthrow 
object.Exception@testthrow.d(3): hi
----------------
testthrow.d:3 void testthrow.foo() [0x1021d422b]
testthrow.d:8 _Dmain [0x1021d4237]
steves@MacBook-Pro objdraw % ldc2 -g -preserve-dwarf-line-section=false testthrow.d
steves@MacBook-Pro objdraw % ./testthrow                                           
object.Exception@testthrow.d(3): hi
----------------
??:? void testthrow.foo() [0x10062022b]
??:? _Dmain [0x100620237]
steves@MacBook-Pro objdraw % unset MACOSX_DEPLOYMENT_TARGET
steves@MacBook-Pro objdraw % ldc2 -g testthrow.d                                   
ld: warning: pointer not aligned at address 0x100294081 (anon + 129 from testthrow.o)
ld: unaligned pointer(s) for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error: /usr/bin/cc failed with status: 1
steves@MacBook-Pro objdraw % ldc2 -g -preserve-dwarf-line-section=false testthrow.d
steves@MacBook-Pro objdraw % ./testthrow
object.Exception@testthrow.d(3): hi
----------------
??:? void testthrow.foo() [0x104820c5b]
??:? _Dmain [0x104820c67]

schveiguy avatar Nov 26 '21 18:11 schveiguy

Thx for testing again - great, so that's it. Apple might have changed the linker scripts in more recent SDKs then. Note that this is a LLVM-hack on our side, as workaround for a druntime limitation (cannot use default .dSYM files) tracked in https://issues.dlang.org/show_bug.cgi?id=20510.

kinke avatar Nov 26 '21 18:11 kinke

OK. For now, I guess I will stick with my original workaround, since I get line/file data out of it.

schveiguy avatar Nov 26 '21 18:11 schveiguy

A question about the "LLVM hack". Is that with the switch or default behavior? Because I would think it's good for the default ldc2 to actually link on MacOS Monterey.

schveiguy avatar Nov 26 '21 18:11 schveiguy

The hack is enabled by default - as it wasn't known to cause trouble and provides the expected file/line infos. Changing that logic depending on target macOS version probably requires specifying a versioned macOS triple via -mtriple and/or ugly env var/host detection logic. Do these apparently fatal linker warnings happen for x64 too, or is that ARM64 only? What about the other Darwin variants? Etc.

kinke avatar Nov 26 '21 18:11 kinke

~~When I get around to updating my old mac to Monterey, I can test there.~~

Scratch that, my old MacBook Pro is too old to receive the update. Someone else would have to test x64.

schveiguy avatar Nov 26 '21 18:11 schveiguy

That said, @Geod24 seems to have started a .dSYM implementation, so it might make more sense to push that rather than wasting time on hack applicability detection here.

kinke avatar Nov 26 '21 19:11 kinke

Do these apparently fatal linker warnings happen for x64 too

It works fine on x86-64.

jacob-carlborg avatar Nov 26 '21 19:11 jacob-carlborg

That said, Geod24 seems to have started a .dSYM implementation, so it might make more sense to push that rather than wasting time on hack applicability detection here.

LDC will then need to start producing .dSYM files.

jacob-carlborg avatar Nov 26 '21 19:11 jacob-carlborg

LDC will then need to start producing .dSYM files.

I thought that's what ld64 does by default (at least with -preserve-dwarf-line-section=false). Anyway, PRs from Apple users are always welcome. :) Speaking of which, I could use some help in https://github.com/ldc-developers/ldc/pull/3871 too - LDC cannot be linked due to missing libm, libz, libxml2… I'm wondering whether that has to do with https://github.com/ldc-developers/ldc/blob/5a28329e2f027ff2e3ea77a299f245e5baa4247c/azure-pipelines.yml#L112 (as the official LDC Mac x64 build has been compatible with 10.9+ for a while and that ideally wouldn't have to change)

kinke avatar Nov 26 '21 20:11 kinke

Speaking of which, I could use some help in #3871 too

I answered in the PR.

jacob-carlborg avatar Nov 27 '21 08:11 jacob-carlborg

@Geod24 Possible to sponsor https://issues.dlang.org/show_bug.cgi?id=20510 in order to have #3864 fixed? Definitely would be interested in keeping Mac compat.

Contact me for testing x86_64 or arm64 binaries, macOS Monterey (arm64/Rosetta) and 10.14 (x86_64) installed here.

p0nce avatar Dec 18 '21 11:12 p0nce

I sadly don't have too much time for LDC development these days, but feel free to ping me if you need any testing as well – just got a M1 Max/Monterey box.

dnadlinger avatar Dec 18 '21 23:12 dnadlinger

I sadly don't have too much time for LDC development these days, but feel free to ping me if you need any testing as well – just got a M1 Max/Monterey box.

I now too have an M1 Macbook Air, so can test too.

Off topic: Already ran into troubles of course :). For now, I managed to build LDC (arm64 binary) by setting MACOSX_DEPLOYMENT_TARGET=11 and using the LDC 1.27.0 arm64 released binary as bootstrapping compiler. Running cmake the first time, it errors with

CMake Error at cmake/Modules/ExtractDMDSystemLinker.cmake:42 (message):
  Failed to link empty D program using
  '/Users/johan/dcompilers/ldc-1.27.1/bin/ldmd2 -wi -link-debuglib':

  ld: library not found for -lpthread

  clang: error: linker command failed with exit code 1 (use -v to see
  invocation)

  Error: /Library/Developer/CommandLineTools/usr/bin/cc failed with status: 1
Call Stack (most recent call first):
  CMakeLists.txt:670 (include)

Simply rerunning cmake "fixes" it, and it happily finishes.

JohanEngelen avatar Dec 28 '21 14:12 JohanEngelen

I cannot reproduce this on M1 Macbook Air with LDC master 6542126.

My local LDC is built by setting MACOSX_DEPLOYMENT_TARGET=11 (only when running cmake, not when executing the tests below, as you can see from the first line) and using the LDC 1.27.1 arm64 released binary as bootstrapping compiler.

❯ echo $MACOSX_DEPLOYMENT_TARGET

❯ file bin/ldc2
bin/ldc2: Mach-O 64-bit executable arm64
❯ bin/ldc2 --version
LDC - the LLVM D compiler (1.28.1-git-6542126):
  based on DMD v2.098.1 and LLVM 12.0.1
  built with LDC - the LLVM D compiler (1.27.1)
  Default target: arm64-apple-darwin21.1.0
  ...
❯ cat testbadlink.d
import core.stdc.stdio;
extern(C) void main() {
	printf("Hello!");
}
❯ bin/ldc2 -betterC -g -run testbadlink.d
Hello!
❯ bin/ldc2 -betterC -run testbadlink.d
Hello!

No errors.

The released LDC 1.27.1 arm64 binaries do indeed not work:

❯ ~/dcompilers/ldc-1.27.1/bin/ldc2 --version
LDC - the LLVM D compiler (1.27.1):
  based on DMD v2.097.2 and LLVM 12.0.1
  built with LDC - the LLVM D compiler (1.27.1)
  Default target: arm64-apple-darwin21.1.0
 ...
❯ ~/dcompilers/ldc-1.27.1/bin/ldc2 -betterC -run testbadlink.d
Hello!
❯ ~/dcompilers/ldc-1.27.1/bin/ldc2 -betterC -g -run testbadlink.d
ld: warning: pointer not aligned at address 0x100008032 (anon + 50 from /var/folders/gm/9sswz8kd70n8tsdbct_fb0vh0000gp/T/objtmp-ldc-029e51/testbadlink.o)
ld: unaligned pointer(s) for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Error: /usr/bin/cc failed with status: 1

Building LDC 1.27.1 locally:

❯ bin/ldc2 --version
LDC - the LLVM D compiler (1.27.1):
  based on DMD v2.097.2 and LLVM 12.0.1
  built with LDC - the LLVM D compiler (1.27.1)
  Default target: arm64-apple-darwin21.1.0
  ...
❯ bin/ldc2 -betterC -run testbadlink.d
Hello!
❯ bin/ldc2 -betterC -g -run testbadlink.d
Hello!

So perhaps something wrong in the way the LDC 1.27.1 release package is made?

JohanEngelen avatar Dec 28 '21 14:12 JohanEngelen

So perhaps something wrong in the way the LDC 1.27.1 release package is made?

Hmm, I'm guessing MACOSX_DEPLOYMENT_TARGET could cause this, when the 1.27.1 release package is built.

jacob-carlborg avatar Dec 28 '21 19:12 jacob-carlborg

Extra info: if I use my self-built ldc-1.27.1 to again build ldc-1.27.1, I no longer need MACOSX_DEPLOYMENT_TARGET=11 to build ldc.

JohanEngelen avatar Dec 28 '21 23:12 JohanEngelen

@JohanEngelen what's the deployment target of that binary? If you run otool -l <binary> and look for the LC_BUILD_VERSION load command, then look at the minos field.

jacob-carlborg avatar Dec 29 '21 07:12 jacob-carlborg

@JohanEngelen what's the deployment target of that binary? If you run otool -l <binary> and look for the LC_BUILD_VERSION load command, then look at the minos field.

On the binaries produced by the new LDC (that does not have the -g problem discussed here):

      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 12.0
      sdk 12.1
   ntools 1
     tool 3
  version 710.1
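To pull just the minos value out of output like the above, something along these lines might help. It is a sketch that assumes the otool -l layout shown here (a `cmd LC_BUILD_VERSION` line followed by a `minos` line); the function name `minos_from_otool` is made up, and older binaries that use a different load command are not handled.

```shell
# Read `otool -l` output on stdin and print the minos field of the first
# LC_BUILD_VERSION load command.
minos_from_otool() {
    awk '
        # Every load command starts with a "cmd <NAME>" line; remember whether
        # we are currently inside LC_BUILD_VERSION.
        $1 == "cmd" { in_build = ($2 == "LC_BUILD_VERSION") }
        in_build && $1 == "minos" { print $2; exit }
    '
}

# Usage (assumes otool is installed and bin/ldc2 is the binary to inspect):
#   otool -l bin/ldc2 | minos_from_otool
```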

JohanEngelen avatar Dec 29 '21 11:12 JohanEngelen

The prebuilt LDC 1.27.0 has minos set to 11.0. I'm wondering if that's what's causing the issue. But the deployment target should be determined at runtime (of the compiler). The deployment target of the compiler shouldn't matter, as long as it's possible to run the binary.

jacob-carlborg avatar Dec 29 '21 14:12 jacob-carlborg

Does minos = 12.0 really mean that the binary cannot be run on <12 macOS?

JohanEngelen avatar Dec 30 '21 00:12 JohanEngelen

Does minos = 12.0 really mean that the binary cannot be run on <12 macOS?

I haven't actually tried. But I know for sure that the deployment target controls what features the binary can use. For example, if you try to compile a binary that uses thread local storage and set the deployment target to 10.6, you'll get an error, because a binary with TLS cannot run on anything older than 10.7.

jacob-carlborg avatar Dec 30 '21 08:12 jacob-carlborg

How do we continue? Shall I prepare an osx-arm64 release package of 1.28.0 using my steps that solve this issue, and upload it (with a recognizably different name) to the 1.28.0 release page, so people can test it? If that works, we can figure out how to fix it in the release automation.

JohanEngelen avatar Dec 30 '21 12:12 JohanEngelen

How do we continue?

I guess remove/disable the __debug_line hack and fix https://issues.dlang.org/show_bug.cgi?id=20510.

Shall I prepare an osx-arm64 release package of 1.28.0 using my steps that solve this issue

I prefer to avoid having separate releases for different versions of the OS.

jacob-carlborg avatar Dec 30 '21 12:12 jacob-carlborg

Shall I prepare an osx-arm64 release package of 1.28.0 using my steps that solve this issue

I prefer to avoid having separate releases for different versions of the OS.

That's not what I meant. I have a version that works for me on OS 12. I wonder if it works for other people too, on macOS 11.

JohanEngelen avatar Dec 30 '21 12:12 JohanEngelen

How do we continue?

I guess remove/disable the __debug_line hack and fix https://issues.dlang.org/show_bug.cgi?id=20510.

Ahh..... I'm using vanilla LLVM, not the LDC version, so that's why my version works... Now I see the hack that was applied. Perhaps it's sufficient to simply increase the alignment of __debug_line section. I'll test that. Edit: No, makes no difference, tried aligning all sections to 8 bytes. (Btw, LDC cannot be built with LDC-LLVM because of this bug, it fails on building its own druntime.)

Output of otool -l on an object file created by 1.27.1 released LDC (i.e. with the __debug_line hack): Edit: this is irrelevant.

Section
  sectname __debug_line
   segname __DWARF
      addr 0x0000000000000218
      size 0x0000000000000064
    offset 1624
     align 2^0 (1)     <-------------
    reloff 1776
    nreloc 1
     flags 0x00000000

JohanEngelen avatar Dec 30 '21 13:12 JohanEngelen