Installation fails on H100 with flash-attention undefined symbol error
Environment:
- GPU: NVIDIA H100 (Hopper architecture)
- Python: 3.11.11
- CUDA: 12.4
- PyTorch: 2.5.1
- OS: Linux
Steps to reproduce:
- Created fresh Python environment
- Installed PyTorch 2.5.1 with CUDA 12.4 support
- Installed unsloth using:
pip install "unsloth[cu124-ampere-torch251] @ git+https://github.com/unslothai/unsloth.git"
Error: When trying to import FastLanguageModel, getting the following error: /data/unsloth/.venv/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZNK3c1011StorageImpl27throw_data_ptr_access_errorEv
The error comes from flash-attention not being compiled against the installed PyTorch build. The undefined symbol demangles to `c10::StorageImpl::throw_data_ptr_access_error() const`, which lives in PyTorch's c10 library, so the prebuilt flash-attn binary is ABI-incompatible with PyTorch 2.5.1.
Would appreciate any guidance on how to resolve this for H100 GPUs. Let me know if you need any additional information!
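When debugging mismatches like this, a quick first step is to dump the installed versions of the packages involved and compare them against a known-good combination. A minimal sketch (the package list below is my assumption of what matters here):

```python
# List installed versions of the packages involved in this issue,
# so they can be compared against a known-good set of pins.
from importlib import metadata

PACKAGES = ["torch", "flash-attn", "xformers", "bitsandbytes", "unsloth"]

def installed_versions(packages):
    """Return {name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package not installed
    return versions

if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```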
After much stumbling, I pulled a working Docker image and reverse-engineered from it the exact versions that run finetuning without errors on my H100 VM in GCP.
Credit: https://hub.docker.com/r/geunhongmin/unsloth-docker
#!/bin/bash
set -e # Exit immediately if a command exits with a non-zero status
set -o pipefail # Catch errors in pipelines
# Define the virtual environment directory
VENV_DIR=".venv"
# Remove any existing virtual environment
if [ -d "$VENV_DIR" ]; then
    echo "Removing existing virtual environment..."
    rm -rf "$VENV_DIR"
fi
# Create a new virtual environment using uv
echo "Creating a new virtual environment..."
uv venv --python=3.10 "$VENV_DIR"
# Activate the virtual environment
source "$VENV_DIR/bin/activate"
# Verify Python version
python -V
# Install PyTorch with CUDA support
echo "Installing PyTorch..."
uv pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu121 # cu121 wheels run fine under a CUDA 12.4 driver
# Verify PyTorch installation
python -c "import torch; print(f'PyTorch: {torch.__version__}\nCUDA available: {torch.cuda.is_available()}\nCUDA version: {torch.version.cuda}')"
# Install additional dependencies
echo "Installing additional dependencies..."
uv pip install numpy accelerate==1.4.0 bitsandbytes==0.45.3 xformers==0.0.28.post3 peft==0.14.0 trl==0.15.2
# Install Unsloth from GitHub
echo "Installing Unsloth..."
uv pip install git+https://github.com/unslothai/unsloth.git
# Ensure setuptools is installed to prevent import errors
uv pip install setuptools
# Install Unsloth Zoo
uv pip install unsloth_zoo==2025.2.7
# Verify Unsloth installation
python -c "from unsloth import FastLanguageModel; print('Unsloth import successful!')"
echo "Installation complete!"
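For reproducibility, the same pins can be captured in a requirements file (derived from the script above; the unsloth git ref is left unpinned, as in the original):

```text
# torch must come from the cu121 index, as in the script:
#   pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu121
torch==2.5.1
numpy
accelerate==1.4.0
bitsandbytes==0.45.3
xformers==0.0.28.post3
peft==0.14.0
trl==0.15.2
setuptools
unsloth @ git+https://github.com/unslothai/unsloth.git
unsloth_zoo==2025.2.7
```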
Try uninstalling flash-attn! It's optional!
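Since flash-attn is optional, you can check whether it is even present before suspecting it. A small sketch using only the standard library (the printed messages are illustrative):

```python
# Check for an optional dependency without importing it
# (importing a broken extension module would itself crash).
import importlib.util

def importable(name: str) -> bool:
    """True if a module with this name can be found on the path."""
    return importlib.util.find_spec(name) is not None

if __name__ == "__main__":
    if importable("flash_attn"):
        print("flash_attn is installed; if its import fails, try: pip uninstall -y flash-attn")
    else:
        print("flash_attn not installed; Unsloth falls back to other attention paths")
```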
Final note: if you do need flash attention, you'll probably have to compile it from source against your exact PyTorch build. Here's how:
pip install einops ninja
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention
python setup.py install # only if you want flash attention; compiling takes a while
Will close this for now.