llm.c
`make` fails to autodetect GPU compute capability
Running `make` (e.g., `make test_gpt2`) on my PC outputs the following:
make: __nvcc_device_query: No such file or directory
"Detected GPU compute capability: "
---------------------------------------------
→ cuDNN is manually disabled by default, run make with `USE_CUDNN=1` to try to enable
✓ OpenMP found
✓ OpenMPI found, OK to train with multiple GPUs
✓ nvcc found, including GPU/CUDA support
---------------------------------------------
cc -Ofast -Wno-unused-result -Wno-ignored-pragmas -Wno-unknown-attributes -march=native -fopenmp -DOMP test_gpt2.c -lm -lgomp -o test_gpt2
This happens even though my PC has an RTX 4090 and, as can be seen, `nvcc` is found. I have already found a solution that relies on `nvidia-smi` rather than `__nvcc_device_query` (which looks suspiciously like an intentionally hidden or temporary file), and the problem is gone. With this change, `make` stops complaining about `__nvcc_device_query`:
---------------------------------------------
→ cuDNN is manually disabled by default, run make with `USE_CUDNN=1` to try to enable
✓ OpenMP found
✓ OpenMPI found, OK to train with multiple GPUs
✓ nvcc found, including GPU/CUDA support
---------------------------------------------
cc -Ofast -Wno-unused-result -Wno-ignored-pragmas -Wno-unknown-attributes -march=native -fopenmp -DOMP test_gpt2.c -lm -lgomp -o test_gpt2
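For illustration, here is a minimal sketch of what an `nvidia-smi`-based detection could look like. It is not necessarily the exact change described above. The function names `normalize_compute_cap` and `detect_compute_cap` are hypothetical, and the sketch assumes a driver recent enough that `nvidia-smi` supports the `compute_cap` query field; on older drivers the query may fail, in which case the result is empty.

```shell
#!/bin/sh
# Hypothetical helper: turn a compute capability such as "8.9"
# into the dotless form "89". Only the first line (first GPU) is used.
normalize_compute_cap() {
    printf '%s\n' "$1" | head -n 1 | tr -d '. '
}

# Hypothetical helper: ask nvidia-smi for the compute capability.
# Assumption: the installed driver supports the compute_cap query field.
# Prints an empty line if nvidia-smi is present but the query fails,
# and prints nothing if nvidia-smi is not installed.
detect_compute_cap() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        raw=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null) || raw=""
        normalize_compute_cap "$raw"
    fi
}
```

A Makefile could then capture the value with something like `$(shell ...)` around the same `nvidia-smi` command, falling back to a manually supplied value when the result is empty.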
@akulchik - what toolkit version are you using and what OS?
Just did a check on an older CUDA 11.7 SDK and the file is there. I think your installation might have a problem. I do like your change, but I'm not sure if it's urgent unless it's critical that we support the older SDKs with the auto-detect. Can you try either reinstalling or using the latest 12.4.1 SDK? The file is supposed to be installed with nvcc.
Hey @akulchik, are you still having problems with this?