
FP16/BF16 support on AMD

Open fakerybakery opened this issue 1 year ago • 6 comments

Hi, might it be possible to add FP16/BF16 support on AMD? Thank you!

fakerybakery avatar Dec 16 '23 02:12 fakerybakery

soon 😉

winglian avatar Dec 22 '23 16:12 winglian

Thanks! Is there any timeline on AMD support?

fakerybakery avatar Jan 09 '24 02:01 fakerybakery

I should have asked you sooner: What issues are you experiencing when trying to run with AMD? Is it on Windows, or is it not supported by ROCm?

As far as I know, only xformers, flash attention, and maybe sample packing (since it might require masking from one of those two) don't work. Apart from that, I think everything else works fine assuming your GPU is compatible with ROCm and you have enough VRAM.
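
For reference, that maps to something like the following in an axolotl YAML config. This is a sketch based on the comment above, not an official ROCm recipe; the key names (`bf16`, `fp16`, `flash_attention`, `sample_packing`) are axolotl config options, but verify them against the version you install:

```yaml
# ROCm-relevant axolotl settings (sketch; check your axolotl version)
bf16: true              # or fp16: true, depending on what your GPU supports
flash_attention: false  # flash-attn doesn't build on ROCm
sample_packing: false   # may depend on flash attention / xformers masking
```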

This is what I use to install axolotl. I have ROCm 5.7.3 installed and a gfx1100 GPU; adjust accordingly.

git clone https://github.com/OpenAccess-AI-Collective/axolotl
cd axolotl
python3 -m venv venv
source venv/bin/activate
pip install -e .

# Swap the CUDA builds for ROCm-compatible ones
pip uninstall -y torch xformers bitsandbytes
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm5.7

# Build the ROCm fork of bitsandbytes from source
cd venv
git clone https://github.com/arlo-phoenix/bitsandbytes-rocm-5.6.git bitsandbytes
cd bitsandbytes
export ROCM_HOME=/opt/rocm-5.7.3
make hip ROCM_TARGET=gfx1100
pip install .
cd ../..
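
Once that finishes, a quick sanity check from inside the venv (a sketch; `torch.cuda.is_bf16_supported()` is the PyTorch call that gates bf16, and ROCm devices are exposed through the `torch.cuda` API):

```python
import torch

# Nightly ROCm wheels report a HIP version here; CUDA builds report None
print("torch", torch.__version__, "| hip:", torch.version.hip)

if torch.cuda.is_available():  # true for ROCm devices too
    print("device:", torch.cuda.get_device_name(0))
    print("bf16 supported:", torch.cuda.is_bf16_supported())
else:
    print("no ROCm/CUDA device visible")
```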

xzuyn avatar Jan 23 '24 18:01 xzuyn

Hi, I got an error saying bf16 isn't supported on the GPU. Maybe it's something with the GPU itself? It's an RX 6800.

fakerybakery avatar Jan 24 '24 16:01 fakerybakery

Couldn't reproduce this setup with rocm-6.0.0 and gfx1100 (7900 XTX). bitsandbytes had errors (even on a simple import in interactive Python) and axolotl would not run anything.

j-dominguez9 avatar Jan 24 '24 16:01 j-dominguez9

I downgraded from ROCm 6 because I had issues with it. 5.7.3 is the latest 5.7 release, and it's been working for me.

xzuyn avatar Jan 25 '24 03:01 xzuyn