OSx installation fails with clang error

Open sshleifer opened this issue 4 years ago • 14 comments

I have been struggling for a few hours to install on OSX and was wondering whether you guys have any tips.

CMake seems to finish successfully, but then make -j4 breaks.

cmake .. -DCOMPILE_CUDA=off

cmake output

-- Project name: marian
-- Project version: v1.9.0+3c7a88f4
CMake Warning at CMakeLists.txt:55 (message):
  CMAKE_BUILD_TYPE not set; setting to Release


-- Checking support for CPU intrinsics
-- Could not find hardware support for AVX2 on this machine.
-- Could not find hardware support for AVX512 on this machine.
-- SSE2 support found
-- SSE3 support found
-- SSE4.1 support found
-- AVX support found
CMake Warning at CMakeLists.txt:293 (message):
  COMPILE_CUDA=off : Building only CPU version


-- Found Tcmalloc: /usr/local/lib/libtcmalloc_minimal.dylib
-- Found Doxygen: /usr/local/bin/doxygen (found version "1.8.17") found components: doxygen missing components: dot
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/shleifer/marian/build

make is all green until 94%, then fails with:

 [94%] Linking CXX executable ../marian-conv
ld: unknown option: --start-group
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [marian-vocab] Error 1
make[1]: *** [src/CMakeFiles/marian_vocab.dir/all] Error 2

(make -j4 fails similarly) Has anyone seen anything like this?

Environment:

Apple clang version 11.0.0 (clang-1100.0.33.17)
Target: x86_64-apple-darwin19.0.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

sshleifer avatar Apr 07 '20 21:04 sshleifer

@XapaJIaMnu Any hints?

emjotde avatar Apr 07 '20 21:04 emjotde

@sshleifer What version of macOS are you using? I have tested the compilation with:

xapajiamnu@dhcp-91-025 marian-dev % sw_vers -productVersion
10.15.3
xapajiamnu@dhcp-91-025 marian-dev % clang++ --version
Apple clang version 11.0.3 (clang-1103.0.32.29)
Target: x86_64-apple-darwin19.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

This seems to be slightly newer than yours. My guess would be that your (very slightly outdated) version of LLVM doesn't support the --start-group switch. Is an upgrade possible for you?

For a short-term solution, do the following:

  1. Compile as normal: cmake -DCOMPILE_CUDA=OFF, then make -j4.
  2. Once compilation fails, run make VERBOSE=1. This prints the exact compile and link commands to the terminal. Copy the failing link commands into a text editor and remove the --start-group flag (and the matching --end-group) from each of them. Then paste them back into the terminal to finish the build manually.

If this fixes the problem, great! We would be able to add this to some exceptions, but it is annoying when we can't reproduce this on our side.

XapaJIaMnu avatar Apr 07 '20 22:04 XapaJIaMnu

I upgraded to 10.15.4 and it didn't work: identical error message. First offending line:

[ 94%] Linking CXX executable ../marian-conv
cd /Users/shleifer/marian/build/src && /usr/local/Cellar/cmake/3.16.2/bin/cmake -E cmake_link_script CMakeFiles/marian_conv.dir/link.txt --verbose=1
/Library/Developer/CommandLineTools/usr/bin/c++  -std=c++11 -pthread  -fPIC -Wno-unused-result -Wno-unknown-warning-option -Wno-unknown-cuda-version -march=native  -msse2 -msse3 -msse4.1 -mavx -DMKL_ILP64 -m64 -Ofast -m64 -funroll-loops -ffinite-math-only -g  -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names  CMakeFiles/marian_conv.dir/command/marian_conv.cpp.o  -o ../marian-conv  ../libmarian.a /usr/local/lib/libtcmalloc_minimal.dylib -Wl,--start-group /opt/intel/mkl/lib/libmkl_intel_ilp64.a /opt/intel/mkl/lib/libmkl_sequential.a /opt/intel/mkl/lib/libmkl_core.a -Wl,--end-group -liconv /usr/local/lib/libtcmalloc_minimal.dylib -Wl,--start-group /opt/intel/mkl/lib/libmkl_intel_ilp64.a /opt/intel/mkl/lib/libmkl_sequential.a /opt/intel/mkl/lib/libmkl_core.a -Wl,--end-group -liconv
ld: unknown option: --start-group
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [marian-conv] Error 1
make[1]: *** [src/CMakeFiles/marian_conv.dir/all] Error 2
make: *** [all] Error 2

I have a successful build on a Linux machine, so I probably will not invest in removing --start-group, but thanks for the idea and the help!

sshleifer avatar Apr 08 '20 00:04 sshleifer

Could you double-check your clang++ version? Is your ld, by any chance, a different version from your clang++?
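One way to check this is to print where each tool resolves from, since the two environments quoted in this thread differ in their InstalledDir (Command Line Tools versus Xcode.app). The following is only a sketch: the function name show_toolchain is made up for illustration, and it reports paths only, not versions.

```shell
# Hypothetical helper: show which compiler and linker binaries are picked up.
show_toolchain() {
  for tool in clang++ c++ ld; do
    # Avoid the variable name "path", which zsh ties to PATH.
    if tool_path=$(command -v "$tool"); then
      echo "$tool -> $tool_path"
    else
      echo "$tool -> not found"
    fi
  done
  # On macOS, xcrun reports the linker of the active developer directory.
  if command -v xcrun >/dev/null 2>&1; then
    echo "xcrun ld -> $(xcrun --find ld)"
    echo "developer dir -> $(xcode-select -p)"
  fi
}
```

Calling show_toolchain in the same shell used for the build shows whether clang++ and ld come from the same toolchain directory.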

XapaJIaMnu avatar Apr 08 '20 01:04 XapaJIaMnu

@sshleifer: I faced the same error on macOS (version 10.14.6). After some insane googling, I finally managed to fix it by removing the "-Wl,--start-group" and "-Wl,--end-group" flags in each module's "link.txt" file. To be clear: you keep the library paths that sit between the flags and remove only the flags themselves.
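The manual edit described above could be scripted roughly as follows. This is a sketch, not a tested fix from the Marian maintainers: the function name strip_group_flags is hypothetical, and it assumes the flags appear in link.txt exactly as in the link line quoted earlier in this thread (-Wl,--start-group and -Wl,--end-group, each preceded by a space).

```shell
# Hypothetical helper: remove the group flags that Apple's ld rejects
# from every CMake-generated link.txt under a build directory.
strip_group_flags() {
  build_dir="${1:-build}"
  find "$build_dir" -name link.txt -type f | while IFS= read -r f; do
    # -i.bak is accepted by both BSD sed (macOS) and GNU sed;
    # it edits in place and keeps a link.txt.bak backup.
    sed -i.bak \
      -e 's/ -Wl,--start-group//g' \
      -e 's/ -Wl,--end-group//g' "$f"
  done
}
```

Run it from the source root as strip_group_flags build after the first failed make, then run make again. The library paths between the flags are left untouched. Re-running cmake may regenerate the link.txt files, in which case the edit would have to be repeated.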

rakeshchada avatar Apr 15 '20 20:04 rakeshchada

@ugermann Can we add those to the ignore list?

XapaJIaMnu avatar Apr 20 '20 16:04 XapaJIaMnu

git blame cmake/FindMKL.cmake

look for --start-group

ugermann avatar Apr 21 '20 00:04 ugermann

It's a mess. Apparently Intel does not support cmake, so you have to go to a web site and select your OS, compiler, Intel MKL version etc. from dropdown lists to get a custom link line. https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor.

That appears to be Intel's official approach to promoting software portability ... :man_facepalming:

There's a zoo of FindMKL.cmake files out there, so there may be something better available. How important is it to have MKL on Mac? There is a USE_MKL option in the CMakeLists.txt. And while clang apparently supports --{start|end}-group, it does not on Mac. My suggestion would be to configure on Mac with

-DUSE_MKL=off

for the time being. Mac users will be so busy adoring their machine that they won't notice the speed difference anyway :grin:.
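Combined with the CPU-only flag from the original report, the suggested configuration might look like the sketch below. The wrapper function configure_marian_mac is made up for illustration; only the two -D options and make -j4 come from this thread, and whether the build then completes on a given Mac is not confirmed here.

```shell
# Hypothetical wrapper: CPU-only configure and build without MKL.
# The parenthesised body runs in a subshell, so the caller's
# working directory is left unchanged.
configure_marian_mac() (
  src_dir="${1:-.}"
  mkdir -p "$src_dir/build" && cd "$src_dir/build" || exit 1
  cmake .. -DCOMPILE_CUDA=off -DUSE_MKL=off && make -j4
)
```

Call it with the path to the Marian checkout, for example configure_marian_mac ~/marian. Since an earlier configure may have cached settings in the build directory, starting from an empty build directory is the safer choice.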

ugermann avatar Apr 23 '20 02:04 ugermann

We could try using this instead: https://github.com/pytorch/pytorch/blob/master/cmake/Modules/FindMKL.cmake

ugermann avatar Apr 23 '20 02:04 ugermann

Might be the third or fourth iteration :) Let's just be careful to not mess up everyone else's MKL finding in the process.

emjotde avatar Apr 23 '20 02:04 emjotde

That's exactly why I'd prefer to stay out of this ...

ugermann avatar Apr 23 '20 02:04 ugermann

Hello, is it possible to build Marian on Mac with GPU support? (I didn't want to create a separate issue for that.) I know that the latest architectures are not supported, but older ones, like Pascal, still work on macOS High Sierra.

kadir-gunel avatar Oct 08 '20 05:10 kadir-gunel

@kadir-gunel, since CUDA is no longer supported on Mac after High Sierra, and we don't have older test Macs with GPU drivers, we haven't tried doing a Mac build with CUDA. If you have a GPU Mac with High Sierra lying around, you can try compiling with the latest supported CUDA and hope.

XapaJIaMnu avatar Oct 09 '20 09:10 XapaJIaMnu

@XapaJIaMnu Thank you. Yes, this is exactly my case: High Sierra with a 1080 Ti. I will give it a shot.

kadir-gunel avatar Oct 09 '20 10:10 kadir-gunel