vllm [Installation]: flash-attention internal "git submodule update" problematic for offline-install

Open hpcpony opened this issue 1 week ago • 0 comments

Your current environment

N/A

How you are installing vllm

pip install .

I was building vllm offline with local clones of CUTLASS and flash-attention. flash-attention's setup.py runs "git submodule update" to populate the CUTLASS include files it needs, which is problematic for an offline install. It would be better if it paid attention to VLLM_CUTLASS_SRC_DIR or something like that.

A simple workaround is to copy the CUTLASS include tree into the flash-attention csrc/cutlass subdirectory before building vllm.
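
For anyone hitting the same thing, the copy step could be scripted roughly as below. This is only a sketch: the function name stage_cutlass_includes is made up for illustration, and both paths are assumptions about where the local clones live. It assumes the CUTLASS clone has an include/ directory at its root and that flash-attention expects the headers under csrc/cutlass/include.

```shell
#!/bin/sh
# Hypothetical helper (not part of vllm or flash-attention): copy a local
# CUTLASS include tree into a flash-attention checkout for an offline build.
#   $1  root of the local CUTLASS clone (assumed to contain include/)
#   $2  root of the local flash-attention clone (assumed to contain csrc/)
stage_cutlass_includes() {
    cutlass_src=$1
    flash_attn_src=$2

    # Fail early if the CUTLASS clone does not look as expected.
    if [ ! -d "$cutlass_src/include" ]; then
        echo "no include/ directory under $cutlass_src" >&2
        return 1
    fi

    # Create csrc/cutlass if the submodule directory is missing,
    # then copy the headers to csrc/cutlass/include.
    mkdir -p "$flash_attn_src/csrc/cutlass"
    cp -R "$cutlass_src/include" "$flash_attn_src/csrc/cutlass/"
}
```

Usage would be something like: stage_cutlass_includes /path/to/cutlass /path/to/flash-attention, followed by "pip install ." in the vllm source tree. The two paths are placeholders, not real defaults.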

Before submitting a new issue...

[x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Feb 17 '25 20:02 hpcpony
