llama
AMD GPUs
So are people with AMD GPUs screwed? I literally just sold my NVIDIA card and bought a Radeon two days ago. I've been trying my hardest to get this damn thing to run, but no matter what I try on Windows or Linux (Xubuntu, to be more specific), it always seems to come back to a CUDA issue. So before I waste more of my time trying desperately to make this work: are there any tools that will let an AMD card be used, or how do I bypass the GPU and just run it off my CPU? Any help would be great.
Some more specs of mine, just in case: Ryzen 5 5600, Radeon 6500, 32 GB RAM.
Check out the library: torch_directml
DirectML is a Windows library that should support AMD as well as NVIDIA GPUs.
It looks like there would be a bit of work involved in converting this code to use DirectML instead of CUDA. Specifically, the parallel library doesn't look like it supports DirectML, so that part might have to be ripped out, and you'd have to be satisfied with running on a single GPU. A rough sketch of the device setup is below.
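For reference, here is a minimal sketch of what pointing PyTorch at DirectML looks like, assuming the torch-directml package is installed; the Linear module here is a stand-in, not the actual llama loading code:

import torch
import torch_directml  # pip install torch-directml

# torch_directml exposes the DirectML device much like a CUDA device.
dml = torch_directml.device()

# Stand-in module; any torch.nn.Module moves over the same way.
model = torch.nn.Linear(4096, 4096).to(dml)

x = torch.randn(1, 4096).to(dml)
with torch.no_grad():
    y = model(x)
print(y.device)  # should report the DirectML device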
Another option is for someone to convert the model to ONNX and use ONNX Runtime with the DirectML provider.
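If someone did produce an ONNX export, the inference side would look roughly like this; the llama.onnx filename is a hypothetical placeholder, and the DirectML provider requires the onnxruntime-directml package:

import onnxruntime as ort

# Requires the onnxruntime-directml package; DmlExecutionProvider is the
# DirectML backend and runs on AMD as well as NVIDIA GPUs.
sess = ort.InferenceSession(
    "llama.onnx",  # hypothetical export; no official one exists
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)

# Input names and shapes depend entirely on how the model was exported.
print([i.name for i in sess.get_inputs()])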
In conclusion, there are a variety of ways to get it to work, but they all require some coding. I hope someone makes a fork to support DirectML, as I'm not quite sure how to get it right at the moment.
(BTW, yes, there are also CPU-only forks, but that seems a waste of your graphics card!)
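If you do fall back to CPU, the usual PyTorch trick is to load the checkpoint with map_location="cpu" and keep weights in float32; a generic sketch, where the filename is just the typical LLaMA shard name (adjust to your download):

import torch

# map_location="cpu" keeps every tensor off the GPU entirely.
ckpt = torch.load("consolidated.00.pth", map_location="cpu")

# After building the model with the repo's own code, load the weights
# and force float32, since CPUs handle half precision poorly:
# model.load_state_dict(ckpt)
# model.float().eval()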
I got it loading and could see the VRAM usage increase, but I couldn't actually test text generation because my graphics card only has 8 GB of VRAM, which isn't enough to run this. https://github.com/lshqqytiger/llama-directml
Works perfectly fine for me on Linux with a 6900 XT. https://github.com/oobabooga/text-generation-webui also makes it really easy.
If you can't get LLaMA to work, try this: https://youtu.be/Bj4erD5NNa0
Check this:
TORCH_COMMAND='pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.1.1' python launch.py --precision full --no-half
ROCm is AMD's GPU compute platform; its HIP layer lets CUDA-style code run on AMD GPUs. Maybe this works as-is, or at least it may be a simple way to run llama on an AMD GPU.
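One quick sanity check after installing a ROCm build of PyTorch: the ROCm wheels reuse the torch.cuda API surface, so a short script confirms the AMD GPU is visible (this only verifies the install; it isn't llama-specific):

import torch

# ROCm builds of PyTorch report through the torch.cuda namespace, so
# is_available() returns True when the AMD GPU is usable.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device="cuda")
    print((x @ x).sum().item())  # the matmul actually runs on the AMD GPU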
cc @jeffdaily for visibility. Closing this issue, but it would be great to have more programmatic AMD support for Llama 1/2.