flash-attention
I thought nvcc wasn't required as of 2.0.7, but it is again?
Collecting flash-attn
Using cached flash_attn-2.1.1.tar.gz (2.3 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [14 lines of output]
fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-ulfkr2zw/flash-attn_4b67593759ae48efb90864a00b44d597/setup.py", line 120, in <module>
raise_if_cuda_home_none("flash_attn")
File "/tmp/pip-install-ulfkr2zw/flash-attn_4b67593759ae48efb90864a00b44d597/setup.py", line 88, in raise_if_cuda_home_none
raise RuntimeError(
RuntimeError: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
torch.__version__ = 2.0.1+cu117
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Weird, can you run with -v and post the output?
pip install flash-attn -v
Using pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)
Defaulting to user installation because normal site-packages is not writeable
Collecting flash-attn
Using cached flash_attn-2.1.1.tar.gz (2.3 MB)
Running command python setup.py egg_info
fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-t7e36fuj/flash-attn_66c7e5c25a05491facab03cf6f89347e/setup.py", line 120, in <module>
raise_if_cuda_home_none("flash_attn")
File "/tmp/pip-install-t7e36fuj/flash-attn_66c7e5c25a05491facab03cf6f89347e/setup.py", line 88, in raise_if_cuda_home_none
raise RuntimeError(
RuntimeError: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
torch.__version__ = 2.0.1+cu118
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /usr/bin/python3 -c '
exec(compile('"'"''"'"''"'"'
# This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
#
# - It imports setuptools before invoking setup.py, to enable projects that directly
# import from `distutils.core` to work with newer packaging standards.
# - It provides a clear error message when setuptools is not installed.
# - It sets `sys.argv[0]` to the underlying `setup.py`, when invoking `setup.py` so
# setuptools doesn'"'"'t think the script is `-c`. This avoids the following warning:
# manifest_maker: standard file '"'"'-c'"'"' not found".
# - It generates a shim setup.py, for handling setup.cfg-only projects.
import os, sys, tokenize
try:
import setuptools
except ImportError as error:
print(
"ERROR: Can not execute `setup.py` since setuptools is not available in "
"the build environment.",
file=sys.stderr,
)
sys.exit(1)
__file__ = %r
sys.argv[0] = __file__
if os.path.exists(__file__):
filename = __file__
with tokenize.open(__file__) as f:
setup_py_code = f.read()
else:
filename = "<auto-generated setuptools caller>"
setup_py_code = "from setuptools import setup; setup()"
exec(compile(setup_py_code, filename, "exec"))
'"'"''"'"''"'"' % ('"'"'/tmp/pip-install-t7e36fuj/flash-attn_66c7e5c25a05491facab03cf6f89347e/setup.py'"'"',), "<pip-setuptools-caller>", "exec"))' egg_info --egg-base /tmp/pip-pip-egg-info-auuqwy4a
cwd: /tmp/pip-install-t7e36fuj/flash-attn_66c7e5c25a05491facab03cf6f89347e/
Preparing metadata (setup.py) ... error
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
We should have prebuilt wheels for this setting (torch 2.0, CUDA 11.8) that setup.py automatically downloads, and nvcc should not be necessary. Are you installing from source or from PyPI (pip install flash-attn --no-build-isolation)? Can you try again with -vvv to pip?
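(For context on the "prebuilt wheels" part: the usual pattern is for setup.py to work out the local torch/CUDA/Python combination and try to download a matching binary wheel, only falling back to an nvcc build when nothing matches. A rough sketch of that pattern follows; the index URL and wheel-naming scheme below are hypothetical, not the real flash-attn release layout.)

```python
# Rough sketch of the "use a prebuilt wheel if one matches, else compile"
# pattern. The index URL and file-name scheme are hypothetical.
import platform
import urllib.request

import torch


def guess_wheel_url(version: str) -> str:
    torch_ver = ".".join(torch.__version__.split("+")[0].split(".")[:2])  # e.g. "2.0"
    cuda_ver = (torch.version.cuda or "cpu").replace(".", "")             # e.g. "118"
    major, minor, _ = platform.python_version_tuple()
    return (
        "https://example.com/flash-attn-wheels/"  # hypothetical index
        f"flash_attn-{version}+cu{cuda_ver}torch{torch_ver}"
        f"-cp{major}{minor}-linux_x86_64.whl"
    )


def fetch_prebuilt_or_build(version: str) -> None:
    try:
        url = guess_wheel_url(version)
        urllib.request.urlretrieve(url, "flash_attn_prebuilt.whl")
        print("Using prebuilt wheel:", url)
    except Exception as exc:  # 404, no network, unsupported combination, ...
        print("No matching prebuilt wheel, falling back to source build:", exc)
        # ... this is where CUDAExtension / nvcc would come into play ...
```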
I've tried with --no-build-isolation and without; my install command is in the code block above now (it was being hidden as the first line of the code block). Here is the latest with -vvv, with and without build isolation: (without): https://pastebin.com/EY7YpzRS (with): https://pastebin.com/UD4rMTAb
I see. The current setup.py might still require nvcc; I'll figure out how to fix it later.
As a workaround for now you can try FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE pip install flash-attn --no-build-isolation
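(For reference, FLASH_ATTENTION_SKIP_CUDA_BUILD is read in setup.py and used to skip registering the CUDA extension entirely, so nothing ever asks for nvcc. A simplified sketch of that pattern, not the literal flash-attn setup.py:)

```python
# Simplified sketch of the FLASH_ATTENTION_SKIP_CUDA_BUILD check
# (illustrative; the real setup.py has more logic around this).
import os

from setuptools import setup

SKIP_CUDA_BUILD = os.getenv("FLASH_ATTENTION_SKIP_CUDA_BUILD", "FALSE") == "TRUE"

ext_modules = []
cmdclass = {}
if not SKIP_CUDA_BUILD:
    # Only import torch's CUDA build tooling when we actually intend to
    # compile, since CUDAExtension(...) needs nvcc / CUDA_HOME.
    from torch.utils.cpp_extension import BuildExtension, CUDAExtension

    ext_modules.append(
        CUDAExtension(
            name="flash_attn_2_cuda",
            sources=["csrc/flash_attn/flash_api.cpp"],  # abbreviated source list
        )
    )
    cmdclass["build_ext"] = BuildExtension

setup(
    name="flash-attn",
    ext_modules=ext_modules,
    cmdclass=cmdclass,
)
```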
This worked. Should I close the issue or wait until resolved properly?
Let's keep this open for now.
I think @tridao fixed this in 0c04943fa226ee13762039a86ee4360536c09c5b. Can you try pip install -U flash-attn now?
Collecting flash-attn
Downloading flash_attn-2.1.2.post3.tar.gz (2.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 20.2 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
fatal: not a git repository (or any of the parent directories): .git
/tmp/pip-install-lfqw4621/flash-attn_3a5a1bca3d3d4043b1c739513830dd1f/setup.py:72: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
warnings.warn(
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-lfqw4621/flash-attn_3a5a1bca3d3d4043b1c739513830dd1f/setup.py", line 126, in <module>
CUDAExtension(
File "/home/teknium/qlora_sweep_3/finetune-study/axolotl/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1048, in CUDAExtension
library_dirs += library_paths(cuda=True)
File "/home/teknium/qlora_sweep_3/finetune-study/axolotl/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1179, in library_paths
if (not os.path.exists(_join_cuda_home(lib_dir)) and
File "/home/teknium/qlora_sweep_3/finetune-study/axolotl/venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2223, in _join_cuda_home
raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
torch.__version__ = 2.0.1+cu117
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
I turned the "raise error" into a warning, but it looks like that's not enough: constructing the CUDAExtension with PyTorch already requires CUDA_HOME. Let me think about it more.
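(In other words, even with the raise downgraded, torch.utils.cpp_extension.CUDAExtension calls library_paths(cuda=True), which errors out when CUDA_HOME is unset. One way around that is to gate the whole extension on CUDA_HOME, roughly like the sketch below; this is an illustration, not the actual fix that later landed.)

```python
# Sketch: skip constructing the extension entirely when no CUDA toolkit is
# visible, instead of only downgrading the nvcc check to a warning.
import warnings

from torch.utils.cpp_extension import CUDA_HOME  # None when no toolkit is found

ext_modules = []
if CUDA_HOME is not None:
    from torch.utils.cpp_extension import CUDAExtension

    ext_modules.append(
        CUDAExtension(
            name="flash_attn_2_cuda",
            sources=["csrc/flash_attn/flash_api.cpp"],  # abbreviated
        )
    )
else:
    warnings.warn(
        "nvcc / CUDA_HOME not found; skipping compilation of the CUDA "
        "extension and relying on a prebuilt binary instead."
    )
```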
I fixed this by re-installing CUDA using the package manager: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu
I'm on WSL2; none of this stuff works, lol.
I have the same issue, trying to follow the https://github.com/microsoft/LLaVA-Med installation instructions.
Hi all, encountering the same issue and I see it's still open. Any leads?
Same issue here. This is still broken.
Same issue here
Just reporting this is still happening. And the workaround still works.
After setting the environment var
export FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE
I am facing the following error:
TypeError: expected string or bytes-like object, got 'NoneType'
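(That particular TypeError usually means a string operation, often a regex, received None. A guess at the cause, from the message alone rather than a full traceback, is something like torch.version.cuda being None on a CPU-only PyTorch build:)

```python
# Minimal reproduction of that class of error (assumed cause, not taken from
# the actual flash-attn traceback).
import re

import torch

cuda_version = torch.version.cuda  # None on CPU-only PyTorch builds
re.match(r"(\d+)\.(\d+)", cuda_version)
# TypeError: expected string or bytes-like object, got 'NoneType'
```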
@luillyfe try:
- git clone https://github.com/Dao-AILab/flash-attention.git
- cd flash-attention
- export FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE
- python setup.py install
Thaaanks!
@tridao What does setting FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE actually do?
Following the steps above (clone the repo, export FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE, then python setup.py install), it works!
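(For anyone landing here later: a quick post-install sanity check like the one below, assuming a CUDA GPU and fp16 inputs, tells you whether the compiled kernels actually ended up in your environment. If the CUDA build was skipped and no prebuilt binary was pulled in, the import fails, which is itself a useful signal.)

```python
# Post-install sanity check (sketch; needs a CUDA GPU at runtime).
import torch
from flash_attn import flash_attn_func  # fails here if the compiled kernels are missing

# (batch, seqlen, nheads, headdim) fp16 tensors on the GPU
q, k, v = (torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16) for _ in range(3))
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # torch.Size([1, 128, 8, 64])
```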