libcudacxx icon indicating copy to clipboard operation
libcudacxx copied to clipboard

The build system is a big pile of hacks

Open jengelh opened this issue 4 years ago • 18 comments

On commit 5d9c7123ae8ddb71cc13c14d7eea623b60ee436c, even a simple configure run does not work on openSUSE Tumbleweed x86_64.

» cmake .
-- The CXX compiler identification is GNU 10.2.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The C compiler identification is unknown
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - failed
-- Check for working C compiler: /usr/bin/c++
-- Check for working C compiler: /usr/bin/c++ - broken
CMake Error at /usr/share/cmake/Modules/CMakeTestCCompiler.cmake:66 (message):
  The C compiler

    "/usr/bin/c++"

  is not able to compile a simple test program.

Since I have nothing in the shell environment that sets a compiler, this must come from one of the cmake files in your project, and surely enough, just grepping for it reveals something:

CMakeLists.txt:8:if (NOT CMAKE_C_COMPILER)
CMakeLists.txt:9:  set(CMAKE_C_COMPILER ${CMAKE_CXX_COMPILER})
CMakeLists.txt:10:  set(CMAKE_C_COMPILER_ARG1 ${CMAKE_CXX_COMPILER_ARG1})

jengelh avatar Sep 19 '20 10:09 jengelh

Removing the four line 8-11 fixes the issue.

jengelh avatar Sep 19 '20 10:09 jengelh

The current CMakeFiles are really not geared towards anything but generating a lit config file to run the test suite with NVCC as the compiler; see the testing guide. There's a fair amount of abuse in them, but you should not need to invoke cmake for libcu++ at all unless you are planning to run tests - in which case you should in fact use NVCC.

We are planning to modify this at least to the point where it's possible to have working add_subdirectory(libcudacxx), but it is not very likely that we'll invest in enabling configuring the main project without using a CUDA C++ compiler.

griwes avatar Sep 19 '20 10:09 griwes

We should nuke the entire current CMake configuration from orbit, it's a mess.

brycelelbach avatar Sep 19 '20 23:09 brycelelbach

@griwes maybe we should put in an advisory or something in CMake that tells people they probably don't need to build it.

brycelelbach avatar Sep 20 '20 00:09 brycelelbach

There isn't a build & install guide either.

Will this be addressed soon?

justicezyx avatar Sep 20 '20 01:09 justicezyx

@justicezyx there's a building_and_testing.md doc if you're looking for how to run the tests.

trxcllnt avatar Sep 20 '20 01:09 trxcllnt

@justicezyx there's nothing to install. It's header only.

brycelelbach avatar Sep 20 '20 07:09 brycelelbach

there's nothing to install. It's header only.

Even headers surely should end up somewhere in /usr.

jengelh avatar Sep 20 '20 07:09 jengelh

If you are installing libcu++ globally in your system, you should probably just install the CUDA Toolkit; if you need a newer version of libcu++ for a specific project, you should probably "install" it locally for that project only. It's hard for me to envision a situation where manually installing libcu++ to /usr is the best course of action. If you have such a situation in mind, could you describe it?

griwes avatar Sep 20 '20 07:09 griwes

you should probably just install the CUDA Toolkit

That is quite insensitive. Someone has to make the toolkit rpm first (even if that's nvidia itself), and what better way than to just run make install from the %install section, because that way, installation is in the hands of the libcuxx source archive and the rpm spec file does not have to chase after new headers or sometihng.

jengelh avatar Sep 20 '20 09:09 jengelh

Usually, the best way to install the Toolkit on an arbitrary Linux system is using the distribution-agnostic runfile. To actually take advantage of this library, you do need other components of the Toolkit anyway - at the very least the compiler. And once you have the compiler, you also have libcu++ in the same Toolkit installation.

If you want to use libcu++ newer than the one in the Toolkit you are using, then - as I said earlier - I'd recommend doing it on a per-project basis; installing it globally in /usr can be problematic in two ways, though it's possible the second one is made irrelevant by the first one, I'm not sure:

  1. NVCC sets its own include paths quite high on the list of include paths it passes to the host compiler for preprocessing, so it's quite possible that the toolkit-installed version will just be in the include paths earlier than what you may put in /usr. I know we've run into this sort of an issue earlier, and I think you'd run into it here.
  2. You'd override the version globally for all projects, regardless of the Toolkit version they are using, even if they aren't under your control, which can lead to negative interactions with parts of the Toolkit that have specific expectations (like the ABI version) of the libcu++ version being available.

Having libcu++ either submoduled or otherwise manually fetched by your project's build system for the use case of using a version newer than the one in the toolkit is going to make both of those issues go away, which is why that's the way I'd recommend for "installing" libcu++ alone for use in your projects.

griwes avatar Sep 20 '20 10:09 griwes

And to be clear, I'm not saying "we'll never have an install target". What I'm saying is "the install target you're asking for would probably not achieve what you need", if that makes sense ;)

griwes avatar Sep 20 '20 10:09 griwes

Install rules should definitely be added at some point.

Re: global overrides, the install doesn't have to target /usr. The install location is customizable by setting CMAKE_INSTALL_PREFIX, and can be handy for several common usecases where someone just wants to generate the minimal set of files needed for the project to work, but not dev files like tests etc (think HPC module systems, linux distro packaging, etc). Without project-defined install rules, the packager has to do a lot more work to figure out what's important to package and what isn't, and might not get it right. It'd be best if we defined what the minimal installed product should look like to eliminate this guesswork and play nice with existing opensource conventions.

Re: NVCC/CTK include paths, as long as the path is added with -I and not -isystem, it will override the CTK includes. This is annoying with CMake imported targets, but I've got some workarounds in Thrust/CUB that we can use here when the time comes.

alliepiper avatar Sep 20 '20 16:09 alliepiper

Agreed on the first point, yes; I think we should do it once NVIDIA/cccl#937 is done, since it'll be rather trivial at that point (just grab everything in include/).

On the point of include order I had something else in mind: if you did in fact install the library in /usr as has been mentioned as something that should happen above, I think you'd end up with the headers on the NVCC include path, but behind the path of libcu++ that's bundled into the toolkit, right?

griwes avatar Sep 20 '20 20:09 griwes

From what I've seen when experimenting with this on Thrust, nvcc's include order is:

  1. Explicit -I include paths
  2. CTK include path (implicitly added)
  3. Explicit -isystem include paths

So installing to /usr/include/cuda/.... and doing nvcc -I/usr/include [...] should use the installed headers and not the CTK ones.

alliepiper avatar Sep 20 '20 20:09 alliepiper

Yeah - and I'd argue that if you are doing -I/usr/include, there's probably a better way to go about it ;) (Plus this could actually break the build if we ever provided wrappers for the system includes, like NVC++ in the HPC SDK does for a couple of headers.)

griwes avatar Sep 20 '20 20:09 griwes

If this issue is now tracking "the build system is a giant pile of hacks and needs to be improved", please update the title accordingly.

brycelelbach avatar Sep 21 '20 02:09 brycelelbach

I'd like to add an option to Thrust that will optionally fetch and use libcu++ via CPM at configure time. For this to happen, libcu++ will need to support the add_subdirectory usecase.

Adding this here as a comment since this seems to be the uber-issue for build system stuff.

alliepiper avatar Oct 19 '20 16:10 alliepiper

@allisonvacanti given that the build system will soon be reworked do we want to close this or keep it open for now?

miscco avatar Feb 23 '23 10:02 miscco

Most of these issues have been addressed in recent months -- if there are any outstanding issues that need to be fixed, open a new, more specific issue.

alliepiper avatar Feb 23 '23 16:02 alliepiper