llm.c
the provided PTX was compiled with an unsupported toolchain.
Does anyone know which CUDA version is required to compile train_gpt2cu? I am getting the error below:
[GPT-2]
max_seq_len: 1024
vocab_size: 50257
num_layers: 12
num_heads: 12
channels: 768
num_parameters: 124439808
train dataset num_batches: 74
val dataset num_batches: 8
batch size: 4
sequence length: 1024
val_num_batches: 10
num_activations: 2456637440
[CUDA ERROR] at file train_gpt2.cu:341:
the provided PTX was compiled with an unsupported toolchain.
My CUDA version is 12.1. Is there any recommendation for getting past this? Thanks.
What driver version are you running?
I might have a simple solution for anyone else who runs into this error.
Notice how the CUDA compilation tools version (as reported by nvcc --version) is 11.5 while the CUDA version (as reported by nvidia-smi) is 12.2. The CUDA version is not what is causing the error; the compilation tools are. I completely removed CUDA from my machine and reinstalled it (https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_network).
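For anyone who wants to compare the two numbers without eyeballing them, here is a minimal Python sketch. It assumes the usual output formats ("release X.Y" in nvcc --version and "CUDA Version: X.Y" in the nvidia-smi header); the function names are made up for this example. NVIDIA's documentation gives the most common cause of this error as PTX generated by a compiler newer than the driver supports, so that is the only case the check flags. It may not explain every setup, for instance a machine with several toolkits installed side by side.

```python
import re


def parse_nvcc_release(text):
    """Return (major, minor) from `nvcc --version` output, or None if not found."""
    m = re.search(r"release (\d+)\.(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def parse_smi_cuda_version(text):
    """Return (major, minor) from the `nvidia-smi` header, or None if not found."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def toolkit_newer_than_driver(nvcc_text, smi_text):
    """True if the toolkit release is newer than the CUDA version the driver reports.

    Returns None when either version cannot be parsed.
    """
    toolkit = parse_nvcc_release(nvcc_text)
    driver = parse_smi_cuda_version(smi_text)
    if toolkit is None or driver is None:
        return None
    # Tuples compare element by element, so (12, 4) > (12, 2) and (11, 8) < (12, 1).
    return toolkit > driver
```

Pass in the text printed by each command. A result of True suggests updating the driver or installing an older toolkit; False means this particular mismatch is not the problem.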
In order:
- Remove/purge cuda https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#removing-cuda-toolkit-and-driver
- sudo apt install build-essential dkms
- sudo apt purge nvidia*
- sudo apt autoremove
- Make sure you get the settings right https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_network
- sudo apt install nvidia-cuda-toolkit
- try running nvcc --version and nvidia-smi
- If both previous cmds worked, compile the gpt2 forward kernel with make and run the binary
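The "did both commands work" check from the last steps can also be scripted. This is only a rough sketch: tool_output and check_tools are hypothetical helper names, and "worked" here just means the command is on PATH and exits with status 0.

```python
import shutil
import subprocess


def tool_output(argv):
    """Run argv and return its stdout, or None if the tool is missing or fails."""
    if shutil.which(argv[0]) is None:
        return None  # command is not on PATH
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout if result.returncode == 0 else None


def check_tools():
    """Return a dict mapping each command name to True if it ran successfully."""
    return {
        "nvcc": tool_output(["nvcc", "--version"]) is not None,
        "nvidia-smi": tool_output(["nvidia-smi"]) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_tools().items():
        print(f"{name}: {'ok' if ok else 'missing or failing'}")
```

If either line reports "missing or failing", the install is probably not complete yet and running make is unlikely to help.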
DISCLAIMER:
- be careful when running these commands if you have important things going on at the hardware level
- I'm not at all a CUDA/hardware specialist so if this solution seems messy, let me know so I can improve next time
EDIT: Your installation might be bloated, or you might be mixing versions. Scroll to the bottom of https://forums.developer.nvidia.com/t/installing-the-latest-nvidia-drivers-cuda-and-cudnn-in-ubuntu-22-04-lts/278487, where someone had a similar case.
I have the same error, but I'm on WSL, so I just updated the Windows NVIDIA driver, since WSL uses the driver installed on Windows.
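If you are not sure whether a shell is running under WSL (and so whether the driver to update is the Windows one), a common heuristic is to look for "microsoft" in /proc/version. A minimal sketch, with hypothetical function names, and no guarantee that it covers custom kernels:

```python
def looks_like_wsl(proc_version_text):
    """Heuristic: WSL kernels typically mention 'microsoft' in /proc/version."""
    return "microsoft" in proc_version_text.lower()


def running_under_wsl(path="/proc/version"):
    """Return True if the kernel version string at `path` looks like WSL."""
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            return looks_like_wsl(f.read())
    except OSError:
        # File missing or unreadable (for example, not a Linux system).
        return False
```

When running_under_wsl() returns True, updating the NVIDIA driver on the Windows side is the thing to try rather than installing a Linux driver inside WSL.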