xformers
on Jetson ORIN, Memory-efficient attention, SwiGLU, sparse and more won't be available.
🐛 Bug
I was trying to run Mistral-7B on a Jetson ORIN and built Triton (OpenAI) and xFormers from source.
However, when trying to run Mistral-7B, I got the following errors:
python -m main demo mistral-7B-v0.1/
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.0a0+41361538.nv23.06 with CUDA 1104 (you have 2.1.0a0+41361538.nv23.06)
Python 3.8.10 (you have 3.8.10)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(1, 27, 32, 128) (torch.float16)
key : shape=(1, 27, 32, 128) (torch.float16)
value : shape=(1, 27, 32, 128) (torch.float16)
attn_bias : <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalLocalAttentionMask'>
p : 0.0
`decoderF` is not supported because:
xFormers wasn't build with CUDA support
attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalLocalAttentionMask'>
operator wasn't built - see `python -m xformers.info` for more info
`flshattF` is not supported because:
xFormers wasn't build with CUDA support
`tritonflashattF` is not supported because:
xFormers wasn't build with CUDA support
attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalLocalAttentionMask'>
operator wasn't built - see `python -m xformers.info` for more info
triton is not available
Only work on pre-MLIR triton for now
`cutlassF` is not supported because:
xFormers wasn't build with CUDA support
operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalLocalAttentionMask'>
operator wasn't built - see `python -m xformers.info` for more info
unsupported embed per head: 128
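The `NotImplementedError` above means every fused-attention backend was rejected, and the common reason in each case is "xFormers wasn't build with CUDA support", i.e. the compiled extension never loaded. A minimal sketch for confirming that from Python (it assumes the compiled extension module is named `xformers._C`, which is what the "can't load C++/CUDA extensions" warning refers to):

```python
import importlib.util

def cuda_ext_built(pkg: str = "xformers") -> bool:
    """True if the package's compiled C++/CUDA extension module can be found."""
    try:
        # find_spec raises ModuleNotFoundError if the parent package is missing
        spec = importlib.util.find_spec(f"{pkg}._C")
    except ModuleNotFoundError:
        return False
    return spec is not None

print(cuda_ext_built())
```

On a build like the one in this report, this should print `False` even though `import xformers` itself succeeds.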
Running the command
python3 -m xformers.info
I got:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.0a0+41361538.nv23.06 with CUDA 1104 (you have 2.1.0a0+41361538.nv23.06)
Python 3.8.10 (you have 3.8.10)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
xFormers 0.0.24+40d3967.d20231209
memory_efficient_attention.cutlassF: unavailable
memory_efficient_attention.cutlassB: unavailable
memory_efficient_attention.decoderF: unavailable
memory_efficient_attention.flshattF: available
memory_efficient_attention.flshattB: available
memory_efficient_attention.smallkF: unavailable
memory_efficient_attention.smallkB: unavailable
memory_efficient_attention.tritonflashattF: unavailable
memory_efficient_attention.tritonflashattB: unavailable
memory_efficient_attention.triton_splitKF: available
indexing.scaled_index_addF: available
indexing.scaled_index_addB: available
indexing.index_select: available
swiglu.dual_gemm_silu: unavailable
swiglu.gemm_fused_operand_sum: unavailable
swiglu.fused.p.cpp: not built
is_triton_available: True
pytorch.version: 2.1.0a0+41361538.nv23.06
pytorch.cuda: available
gpu.compute_capability: 8.7
gpu.name: Orin
build.info: available
build.cuda_version: 1104
build.python_version: 3.8.10
build.torch_version: 2.1.0a0+41361538.nv23.06
build.env.TORCH_CUDA_ARCH_LIST: None
build.env.XFORMERS_BUILD_TYPE: None
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: None
source.privacy: open source
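Note that `build.env.TORCH_CUDA_ARCH_LIST` is `None` while `gpu.compute_capability` is `8.7`: the CUDA kernels were likely never compiled for Orin at all. A hedged rebuild sketch (the `MAX_JOBS` value is an assumption for Jetson memory limits, not something the report confirms):

```shell
# Rebuild xFormers from source with Orin's architecture (sm_87) pinned.
# build.env.TORCH_CUDA_ARCH_LIST was None in the failing build above.
export TORCH_CUDA_ARCH_LIST="8.7"
export MAX_JOBS=4   # assumption: cap parallel nvcc jobs so the Jetson doesn't exhaust RAM
echo "building xFormers for arch ${TORCH_CUDA_ARCH_LIST}"
# python setup.py install   # same build command used in this report
```

After rebuilding, `python -m xformers.info` should show `build.env.TORCH_CUDA_ARCH_LIST: 8.7` and the cutlass/decoder operators as available.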
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Environment
Please copy and paste the output from the environment collection script from PyTorch (or fill out the checklist below manually).
You can run the script with:
# For security purposes, please check the contents of collect_env.py before running it.
python -m torch.utils.collect_env
- PyTorch Version (e.g., 1.0): 2.1.0a0+41361538.nv23.06
- OS (e.g., Linux): Linux (Jetson ORIN)
- How you installed PyTorch (conda, pip, source): n/a
- Build command you used (if compiling from source): python setup.py install
- Python version: 3.8
- CUDA/cuDNN version: 1104 (CUDA 11.4)
- GPU models and configuration: Jetson ORIN iGPU
- Any other relevant information:
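For reference, the `1104` that xformers.info prints for `build.cuda_version` appears to be a packed major/minor encoding; the decoding below is an assumption based on this output (1104 corresponding to CUDA 11.4, the JetPack 5.x CUDA release):

```python
def decode_cuda_build_version(v: int) -> str:
    """Decode a packed CUDA version (assumed major*100 + minor, e.g. 1104 -> 11.4)."""
    return f"{v // 100}.{v % 100}"

print(decode_cuda_build_version(1104))  # -> 11.4
```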