
[ERROR] Cannot install the package

Open xju2 opened this issue 1 year ago • 1 comments

The CMake configuration fails with the following error. The same error occurs on both the stable and main branches.

      -- JAX support: OFF
      -- Configuring done
      CMake Error at common/CMakeLists.txt:39 (add_library):
        Target "transformer_engine" links to target "CUDA::cublas" but the target
        was not found.  Perhaps a find_package() call is missing for an IMPORTED
        target, or an ALIAS target is missing?

cuBLAS (the library behind the CUDA::cublas target) is installed through pip, specifically nvidia-cublas-cu12==12.1.3.1.

xju2 avatar Apr 23 '24 23:04 xju2

@ksivaman Could you take a look?

ptrendx avatar May 16 '24 18:05 ptrendx

It's odd that it didn't fail when it searched for cuBLAS: https://github.com/NVIDIA/TransformerEngine/blob/115a27ef2b7d206f8fc6634cfdec692913578ffc/transformer_engine/CMakeLists.txt#L22. Also, the cuBLAS pip wheel is intended for runtime use and doesn't include developer tools (https://docs.nvidia.com/cuda/cuda-installation-guide-linux/#pip-wheels). Building TE requires the CUDA Toolkit, which includes cuBLAS.

timmoon10 avatar May 21 '24 01:05 timmoon10

@xju2 Adding to Tim's comment, are you building on a CUDA-compatible device with the toolkit installed? If so, could you try building without installing nvidia-cublas-cu12==12.1.3.1 via pip? It should not be required.
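
For anyone wanting to answer the two questions above quickly, here is a minimal sketch of an environment check. It assumes that finding `nvcc` on `PATH` is a reasonable sign that the CUDA Toolkit is installed, and that the pip wheel is registered under the name `nvidia-cublas-cu12`; the helper name `cuda_build_environment` is hypothetical and is not part of TE.

```python
import shutil
from importlib import metadata


def cuda_build_environment(wheel_name="nvidia-cublas-cu12"):
    """Report whether nvcc is on PATH and whether the cuBLAS pip wheel is installed.

    Hypothetical helper: an nvcc on PATH is only a hint that the CUDA Toolkit
    is present, not a guarantee that CMake will locate it.
    """
    try:
        wheel_version = metadata.version(wheel_name)
    except metadata.PackageNotFoundError:
        wheel_version = None  # the wheel is not installed in this environment
    return {
        "nvcc_path": shutil.which("nvcc"),  # None when nvcc is not on PATH
        "cublas_wheel_version": wheel_version,
    }


if __name__ == "__main__":
    env = cuda_build_environment()
    print("nvcc on PATH:", env["nvcc_path"] or "not found")
    print("cuBLAS pip wheel:", env["cublas_wheel_version"] or "not installed")
```

If `nvcc` is missing while the wheel is installed, the environment probably has only the runtime library, which matches the situation described in the comments above.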

ksivaman avatar May 21 '24 01:05 ksivaman

It turned out to be an issue with an old CMake version: the build failed with CMake 3.20.4 but worked with CMake 3.24.3. It is probably a good time to update the CMake requirement in https://github.com/NVIDIA/TransformerEngine/blob/main/transformer_engine/CMakeLists.txt.
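
A minimal sketch for checking the local CMake against the two versions reported here. Only 3.20.4 (failed) and 3.24.3 (worked) come from this thread; the true minimum lies somewhere between them and is not established here, and the helper names are hypothetical.

```python
import re
import shutil
import subprocess

KNOWN_BAD = (3, 20, 4)   # reported in this thread as failing
KNOWN_GOOD = (3, 24, 3)  # reported in this thread as working


def parse_cmake_version(text):
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("could not find a CMake version in: %r" % text)
    return tuple(int(part) for part in match.groups())


def installed_cmake_version():
    """Return the version of the cmake found on PATH, or None if there is none."""
    exe = shutil.which("cmake")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=True
    )
    return parse_cmake_version(result.stdout)


if __name__ == "__main__":
    version = installed_cmake_version()
    if version is None:
        print("cmake not found on PATH")
    elif version >= KNOWN_GOOD:
        print("cmake", version, "is at least the version reported to work")
    elif version <= KNOWN_BAD:
        print("cmake", version, "is no newer than the version reported to fail")
    else:
        print("cmake", version, "lies between the two reported versions; untested here")
```

Versions are compared as integer tuples, so 3.9.x correctly sorts below 3.20.x, which a plain string comparison would get wrong.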

xju2 avatar May 24 '24 00:05 xju2