vecmem
vecmem copied to clipboard
Atomic Failure on Windows with CUDA, main branch (2024.08.03.)
Unfortunately #275 seems to have introduced a failure in a setup that's not too easy to test in the CI. 😦 While building the latest code on Windows, with CUDA available, I bump into:
...
[build] tmpxft_000062a0_00000000-7_test_cuda_jagged_vector_view_kernels.cudafe1.cpp
[build] tmpxft_0000606c_00000000-7_test_cuda_edm_kernels.cudafe1.cpp
[build] C:\Users\krasz\ATLAS\vecmem\vecmem\core\include\vecmem/memory/impl/device_atomic_ref.ipp(187): error C3861: '__memorder_vecmem_to_builtin': identifier not found [C:\Users\krasz\ATLAS\vecmem\vecmem\build\tests\cuda\vecmem_test_cuda.vcxproj]
[build] C:\Users\krasz\ATLAS\vecmem\vecmem\core\include\vecmem/memory/impl/device_atomic_ref.ipp(187): note: the template instantiation context (the oldest one first) is
[build] C:\Users\krasz\ATLAS\vecmem\vecmem\tests\cuda\../common/jagged_soa_container_helpers.hpp(21): note: see reference to class template instantiation 'vecmem::testing::jagged_soa_interface<vecmem::edm::device<vecmem::edm::schema<vecmem::edm::type::scalar<int>,vecmem::edm::type::vector<float>,vecmem::edm::type::jagged_vector<double>,vecmem::edm::type::scalar<float>,vecmem::edm::type::jagged_vector<int>,vecmem::edm::type::vector<int>>>>' being compiled
[build] C:\Users\krasz\ATLAS\vecmem\vecmem\tests\common\jagged_soa_container.hpp(18): note: see reference to class template instantiation 'vecmem::edm::device<vecmem::edm::schema<vecmem::edm::type::scalar<int>,vecmem::edm::type::vector<float>,vecmem::edm::type::jagged_vector<double>,vecmem::edm::type::scalar<float>,vecmem::edm::type::jagged_vector<int>,vecmem::edm::type::vector<int>>>' being compiled
[build] C:\Users\krasz\ATLAS\vecmem\vecmem\core\include\vecmem/edm/device.hpp(121): note: see reference to class template instantiation 'vecmem::tuple<int *,vecmem::device_vector<TYPE>,vecmem::jagged_device_vector<double>,float *,vecmem::jagged_device_vector<int>,vecmem::device_vector<int>>' being compiled
[build] with
[build] [
[build] TYPE=float
[build] ]
...
[build] C:\Users\krasz\ATLAS\vecmem\vecmem\core\include\vecmem/memory/impl/device_atomic_ref.ipp(294): error C3861: '__atomic_fetch_add': identifier not found [C:\Users\krasz\ATLAS\vecmem\vecmem\build\tests\cuda\vecmem_test_cuda.vcxproj]
[build] C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.6.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin\nvcc.exe" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu -IC:\Users\krasz\ATLAS\vecmem\vecmem\build\core\CMakeFiles -IC:\Users\krasz\ATLAS\vecmem\vecmem\core\include -IC:\Users\krasz\ATLAS\vecmem\vecmem\build\cuda\CMakeFiles -IC:\Users\krasz\ATLAS\vecmem\vecmem\cuda\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include" -I"C:\Users\krasz\ATLAS\vecmem\vecmem\build\_deps\googletest-src\googletest\include" -I"C:\Users\krasz\ATLAS\vecmem\vecmem\build\_deps\googletest-src\googletest" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include" --keep-dir vecmem_test_cuda\x64\RelWithDebInfo -maxrregcount=0 --machine 64 --compile -cudart static -std=c++14 --generate-code=arch=compute_52,code=[compute_52,sm_52] -Xcompiler="/EHsc -Zi -Ob1 /Zc:__cplusplus" -D_WINDOWS -DNDEBUG -DVECMEM_DEBUG_MSG_LVL=0 -DVECMEM_SOURCE_DIR_LENGTH=35 -DVECMEM_HAVE_PMR_MEMORY_RESOURCE -D"CMAKE_INTDIR=\"RelWithDebInfo\"" -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -DVECMEM_DEBUG_MSG_LVL=0 -DVECMEM_SOURCE_DIR_LENGTH=35 -DVECMEM_HAVE_PMR_MEMORY_RESOURCE -D"CMAKE_INTDIR=\"RelWithDebInfo\"" -Xcompiler "/EHsc /W4 /nologo /O2 /FS /Zi /MD /GR" -Xcompiler "/Fdvecmem_test_cuda.dir\RelWithDebInfo\vc143.pdb" -o vecmem_test_cuda.dir\RelWithDebInfo\test_cuda_edm_kernels.obj "C:\Users\krasz\ATLAS\vecmem\vecmem\tests\cuda\test_cuda_edm_kernels.cu"" exited with code 2. [C:\Users\krasz\ATLAS\vecmem\vecmem\build\tests\cuda\vecmem_test_cuda.vcxproj]
So nvcc
is trying to compile test_cuda_edm_kernels.cu
, and gets confused about how to use vecmem::device_atomic_ref
.
The preprocessor flags will need to be tweaked to fix this. 🤔