llama.cpp
I get the following error when compiling with make LLAMA_CUBLAS=1:
make LLAMA_CUBLAS=1 LDFLAGS=-L/usr/local/cuda-11.6/targets/x86_64-linux/lib
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
I LDFLAGS: -L/usr/local/cuda-11.6/targets/x86_64-linux/lib
I CC: cc (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0
I CXX: g++ (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include examples/main/main.cpp ggml.o llama.o common.o ggml-cuda.o -o main -L/usr/local/cuda-11.6/targets/x86_64-linux/lib
ggml-cuda.o: In function mul_f32(float const*, float const*, float*, int, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xcd): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x10b): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
convert_fp16_to_fp32_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1ab): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x245): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x28f): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
dequantize_row_q8_0_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x31b): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b5): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3ff): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
dequantize_row_q5_1_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x48b): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x525): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x56f): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
dequantize_row_q5_0_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x5fb): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x695): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x6df): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
dequantize_row_q4_1_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x76b): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x805): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x84f): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
dequantize_row_q4_0_cuda(void const*, float*, int, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x8db): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x975): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x9bf): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
ggml_cuda_pool_malloc(unsigned long, unsigned long*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xa11): undefined reference to cudaGetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xaab): undefined reference to
cudaMalloc'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xac2): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xb04): undefined reference to
cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xc55): undefined reference to
cudaMemcpy2DAsync'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xcbc): undefined reference to cudaMemcpy2DAsync' ggml-cuda.o: In function
ggml_cuda_pool_free(void*, unsigned long)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xd41): undefined reference to cudaGetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xde9): undefined reference to
cudaFree'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xdff): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xe40): undefined reference to
cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*) [clone .constprop.19]': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xf75): undefined reference to
cudaMemcpy2DAsync'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xfdb): undefined reference to cudaMemcpy2DAsync' ggml-cuda.o: In function
ggml_cuda_op(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, void ()(ggml_tensor const, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&), bool)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x14c0): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x185b): undefined reference to
cudaEventRecord'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x18f0): undefined reference to cudaGetLastError' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x190c): undefined reference to
cudaStreamWaitEvent'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x19a3): undefined reference to cudaMemcpyAsync' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1b0c): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1b7e): undefined reference to cudaMemcpyAsync' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1dd4): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e1b): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e32): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e4c): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1e59): undefined reference to
cudaDeviceSynchronize'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1f52): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1f6c): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1f85): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1fa8): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1fc1): undefined reference to cudaGetErrorString' ggml-cuda.o:tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x1fe3): more undefined references to
cudaGetErrorString' follow
ggml-cuda.o: In function void dequantize_mul_mat_vec<1, 1, &(convert_f16(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x212b): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2169): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
void dequantize_mul_mat_vec<32, 2, &(dequantize_q4_0(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x224b): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2289): undefined reference to
cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_mul_mat_vec<32, 2, &(dequantize_q4_1(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x236b): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x23a9): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
void dequantize_mul_mat_vec<32, 2, &(dequantize_q5_0(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x248b): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x24c9): undefined reference to
cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_mul_mat_vec<32, 2, &(dequantize_q5_1(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x25ab): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x25e9): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
void dequantize_mul_mat_vec<32, 1, &(dequantize_q8_0(void const*, int, int, float&, float&))>(void const*, float const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x26cb): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2709): undefined reference to
cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_block<32, 1, &(dequantize_q8_0(void const*, int, int, float&, float&))>(void const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x27b3): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x27f6): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
void dequantize_block<32, 2, &(dequantize_q5_0(void const*, int, int, float&, float&))>(void const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x28a3): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x28e6): undefined reference to
cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_block<32, 2, &(dequantize_q5_1(void const*, int, int, float&, float&))>(void const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2993): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x29d6): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
void dequantize_block<32, 2, &(dequantize_q4_0(void const*, int, int, float&, float&))>(void const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2a83): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2ac6): undefined reference to
cudaLaunchKernel'
ggml-cuda.o: In function void dequantize_block<32, 2, &(dequantize_q4_1(void const*, int, int, float&, float&))>(void const*, float*, int)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2b73): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2bb6): undefined reference to cudaLaunchKernel' ggml-cuda.o: In function
void dequantize_block<1, 1, &(convert_f16(void const*, int, int, float&, float&))>(void const*, float*, int)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2c63): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2ca6): undefined reference to
cudaLaunchKernel'
ggml-cuda.o: In function ggml_init_cublas': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2d3b): undefined reference to
cudaGetDeviceCount'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2dae): undefined reference to cudaGetDeviceProperties' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2f47): undefined reference to
cudaSetDevice'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2f6a): undefined reference to cudaStreamCreateWithFlags' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2f80): undefined reference to
cudaStreamCreateWithFlags'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2fa9): undefined reference to cudaEventCreateWithFlags' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2fc6): undefined reference to
cublasCreate_v2'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x2fe5): undefined reference to cublasSetMathMode' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x304d): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3099): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x30c2): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x30dc): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x30f3): undefined reference to
cublasGetStatusString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x311b): undefined reference to cublasGetStatusString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3148): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3193): undefined reference to cudaGetErrorString' ggml-cuda.o: In function
ggml_cuda_host_malloc':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x349e): undefined reference to cudaMallocHost' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x34cb): undefined reference to
cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_host_free': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3542): undefined reference to
cudaFreeHost'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3551): undefined reference to cudaGetErrorString' ggml-cuda.o: In function
ggml_cuda_load_data':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x387f): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x38bb): undefined reference to
cudaMalloc'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3919): undefined reference to cudaMemcpy' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x391e): undefined reference to
cudaDeviceSynchronize'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3997): undefined reference to cudaSetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3a3f): undefined reference to
cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_free_data': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b23): undefined reference to
cudaSetDevice'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b34): undefined reference to cudaFree' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3b64): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x3ba7): undefined reference to cudaGetErrorString' ggml-cuda.o: In function
ggml_cuda_compute_forward':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x410b): undefined reference to cudaMemcpyAsync' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4124): undefined reference to
cublasSetStream_v2'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4184): undefined reference to cublasGemmEx' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x41c0): undefined reference to
cudaMemcpyAsync'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4263): undefined reference to cudaDeviceSynchronize' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4342): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4385): undefined reference to cudaGetErrorString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x43a2): undefined reference to
cublasGetStatusString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x43c7): undefined reference to cublasGetStatusString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x43f0): undefined reference to
cudaGetErrorString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x4405): undefined reference to cudaGetErrorString' ggml-cuda.o: In function
ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xcf7): undefined reference to cudaMemcpyAsync' ggml-cuda.o: In function
__cudaUnregisterBinaryUtil()':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0xe78): undefined reference to __cudaUnregisterFatBinary' ggml-cuda.o: In function
ggml_cuda_h2d_tensor_2d(void*, ggml_tensor const*, long, long, long, long, CUstream_st*) [clone .constprop.19]':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text+0x101a): undefined reference to cudaMemcpyAsync' ggml-cuda.o: In function
ggml_cuda_op_mul_mat_cublas(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x84): undefined reference to cudaGetDevice' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xa9): undefined reference to
cublasSetStream_v2'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xed): undefined reference to cublasSgemm_v2' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x158): undefined reference to
cublasGetStatusString'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x19a): undefined reference to cublasGetStatusString' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x1b9): undefined reference to
cudaGetErrorString'
ggml-cuda.o: In function ggml_cuda_op_dequantize_mul_mat_vec(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&)': tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xe5): undefined reference to
__cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0xf9): undefined reference to cudaGetLastError' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x185): undefined reference to
__cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x22c): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x27d): undefined reference to
cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x2e5): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x38c): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x3dd): undefined reference to cudaLaunchKernel' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x445): undefined reference to
__cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x4ec): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x53d): undefined reference to
cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x5a5): undefined reference to __cudaPushCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x64c): undefined reference to
__cudaPopCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x69d): undefined reference to cudaLaunchKernel' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x705): undefined reference to
__cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x7ac): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x7fd): undefined reference to
cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x8ab): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x8fc): undefined reference to
cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z35ggml_cuda_op_dequantize_mul_mat_vecPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x946): undefined reference to cudaGetErrorString' ggml-cuda.o: In function
ggml_cuda_op_mul(ggml_tensor const*, ggml_tensor const*, ggml_tensor*, char*, float*, float*, float*, long, long, int, CUstream_st*&)':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x111): undefined reference to cudaGetLastError' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x195): undefined reference to
__cudaPushCallConfiguration'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x269): undefined reference to __cudaPopCallConfiguration' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x2ba): undefined reference to
cudaLaunchKernel'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text._Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st[_Z16ggml_cuda_op_mulPK11ggml_tensorS1_PS_PcPfS4_S4_lliRP11CUstream_st]+0x2f9): undefined reference to cudaGetErrorString' ggml-cuda.o: In function
__sti____cudaRegisterAll()':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x9): undefined reference to __cudaRegisterFatBinary' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x3d): undefined reference to
__cudaRegisterFunction'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x6b): undefined reference to __cudaRegisterFunction' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x99): undefined reference to
__cudaRegisterFunction'
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0xc7): undefined reference to __cudaRegisterFunction' tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0xf5): undefined reference to
__cudaRegisterFunction'
ggml-cuda.o:tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x123): more undefined references to __cudaRegisterFunction' follow ggml-cuda.o: In function
__sti____cudaRegisterAll()':
tmpxft_000035bd_00000000-7_ggml-cuda.cudafe1.cpp:(.text.startup+0x275): undefined reference to `__cudaRegisterFatBinaryEnd'
collect2: error: ld returned 1 exit status
Makefile:251: recipe for target 'main' failed
make: *** [main] Error 1
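The log repeats a small set of symbols many times, so it helps to collapse it to the unique names before guessing which libraries are missing. A rough sketch, assuming the build output was saved to a file (build.log is a made-up name) and that the linker uses GNU ld's usual wording with the symbol between a backtick and a single quote; undefined_symbols is a hypothetical helper, not part of llama.cpp:

```shell
# List each undefined symbol once, with how often the linker reported it.
# Reads linker output on stdin and expects GNU ld's
#   undefined reference to `symbol'
# form; lines that do not match are ignored.
undefined_symbols() {
    sed -n "s/.*undefined reference to \`\([^']*\)'.*/\1/p" | sort | uniq -c | sort -rn
}

# Typical use (build.log is a hypothetical file name):
#   make LLAMA_CUBLAS=1 2>&1 | tee build.log
#   undefined_symbols < build.log
```

In a log like the one above, this should leave a short list of cuda*, __cuda* and cublas* names, which points at the CUDA runtime and cuBLAS libraries not being linked.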
make LLAMA_CUBLAS=1 LDFLAGS=-L/usr/local/cuda-11.6/targets/x86_64-linux/lib
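It's not possible to override compilation flags like this with the current Makefile: a variable set on the make command line replaces whatever the Makefile assigns or appends to it, and the link command in your log indeed carries only your -L path and no -l options for the CUDA libraries.
I suggest you use CMake instead to have more control over the build configuration.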
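The effect of setting LDFLAGS on the command line can be reproduced with a throwaway Makefile. This is only a sketch, assuming GNU make is installed; demo.mk and the two -l flags in it are stand-ins for illustration and are not copied from the llama.cpp Makefile:

```shell
# Throwaway demo; demo.mk and its contents are hypothetical stand-ins.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

cat > "$tmp/demo.mk" <<'EOF'
LDFLAGS += -lcublas -lcudart
$(info LDFLAGS=$(LDFLAGS))
all: ;
EOF

# Print the LDFLAGS value make ends up with for the given extra arguments.
# LDFLAGS and MAKEFLAGS are unset first so the environment cannot interfere.
show_ldflags() {
    ( unset LDFLAGS MAKEFLAGS
      make -s -f "$tmp/demo.mk" "$@" | grep '^LDFLAGS=' )
}

show_ldflags                      # prints LDFLAGS=-lcublas -lcudart
show_ldflags LDFLAGS=-L/opt/lib   # prints LDFLAGS=-L/opt/lib
```

In the second call the += line in the Makefile is ignored, because GNU make gives command-line variables priority over ordinary assignments unless the Makefile uses the override directive.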
In my case, I somehow had installed two CUDA versions (10.1 and 12.2):
$ dpkg -l | grep cuda
ii cuda 12.2.0-1 amd64 CUDA meta-package
ii cuda-12-2 12.2.0-1 amd64 CUDA 12.2 meta-package
ii cuda-cccl-12-2 12.2.53-1 amd64 CUDA CCCL
ii cuda-command-line-tools-12-2 12.2.0-1 amd64 CUDA command-line tools
ii cuda-compiler-12-2 12.2.0-1 amd64 CUDA compiler
ii cuda-cudart-12-2 12.2.53-1 amd64 CUDA Runtime native Libraries
ii cuda-cudart-dev-12-2 12.2.53-1 amd64 CUDA Runtime native dev links, headers
ii cuda-cuobjdump-12-2 12.2.53-1 amd64 CUDA cuobjdump
ii cuda-cupti-12-2 12.2.60-1 amd64 CUDA profiling tools runtime libs.
ii cuda-cupti-dev-12-2 12.2.60-1 amd64 CUDA profiling tools interface.
ii cuda-cuxxfilt-12-2 12.2.53-1 amd64 CUDA cuxxfilt
ii cuda-demo-suite-12-2 12.2.53-1 amd64 Demo suite for CUDA
ii cuda-documentation-12-2 12.2.53-1 amd64 CUDA documentation
ii cuda-driver-dev-12-2 12.2.53-1 amd64 CUDA Driver native dev stub library
ii cuda-drivers 535.54.03-1 amd64 CUDA Driver meta-package, branch-agnostic
ii cuda-drivers-535 535.54.03-1 amd64 CUDA Driver meta-package, branch-specific
ii cuda-gdb-12-2 12.2.53-1 amd64 CUDA-GDB
ii cuda-keyring 1.1-1 all GPG keyring for the CUDA repository
ii cuda-libraries-12-2 12.2.0-1 amd64 CUDA Libraries 12.2 meta-package
ii cuda-libraries-dev-12-2 12.2.0-1 amd64 CUDA Libraries 12.2 development meta-package
ii cuda-nsight-12-2 12.2.53-1 amd64 CUDA nsight
ii cuda-nsight-compute-12-2 12.2.0-1 amd64 NVIDIA Nsight Compute
ii cuda-nsight-systems-12-2 12.2.0-1 amd64 NVIDIA Nsight Systems
ii cuda-nvcc-12-2 12.2.91-1 amd64 CUDA nvcc
ii cuda-nvdisasm-12-2 12.2.53-1 amd64 CUDA disassembler
ii cuda-nvml-dev-12-2 12.2.81-1 amd64 NVML native dev links, headers
ii cuda-nvprof-12-2 12.2.60-1 amd64 CUDA Profiler tools
ii cuda-nvprune-12-2 12.2.53-1 amd64 CUDA nvprune
ii cuda-nvrtc-12-2 12.2.91-1 amd64 NVRTC native runtime libraries
ii cuda-nvrtc-dev-12-2 12.2.91-1 amd64 NVRTC native dev links, headers
ii cuda-nvtx-12-2 12.2.53-1 amd64 NVIDIA Tools Extension
ii cuda-nvvp-12-2 12.2.60-1 amd64 CUDA Profiler tools
ii cuda-opencl-12-2 12.2.53-1 amd64 CUDA OpenCL native Libraries
ii cuda-opencl-dev-12-2 12.2.53-1 amd64 CUDA OpenCL native dev links, headers
ii cuda-profiler-api-12-2 12.2.53-1 amd64 CUDA Profiler API
ii cuda-repo-ubuntu2004-12-1-local 12.1.1-530.30.02-1 amd64 cuda repository configuration files
ii cuda-runtime-12-2 12.2.0-1 amd64 CUDA Runtime 12.2 meta-package
ii cuda-sanitizer-12-2 12.2.53-1 amd64 CUDA Sanitizer
ii cuda-toolkit-12-1-config-common 12.1.105-1 all Common config package for CUDA Toolkit 12.1.
ii cuda-toolkit-12-2 12.2.0-1 amd64 CUDA Toolkit 12.2 meta-package
ii cuda-toolkit-12-2-config-common 12.2.53-1 all Common config package for CUDA Toolkit 12.2.
ii cuda-toolkit-12-config-common 12.2.53-1 all Common config package for CUDA Toolkit 12.
ii cuda-toolkit-config-common 12.2.53-1 all Common config package for CUDA Toolkit.
ii cuda-tools-12-2 12.2.0-1 amd64 CUDA Tools meta-package
ii cuda-visual-tools-12-2 12.2.0-1 amd64 CUDA visual tools
ii libcudart10.1:amd64 10.1.243-3 amd64 NVIDIA CUDA Runtime Library
ii nvidia-cuda-dev 10.1.243-3 amd64 NVIDIA CUDA development files
ii nvidia-cuda-doc 10.1.243-3 all NVIDIA CUDA and OpenCL documentation
ii nvidia-cuda-gdb 10.1.243-3 amd64 NVIDIA CUDA Debugger (GDB)
ii nvidia-cuda-toolkit 10.1.243-3 amd64 NVIDIA CUDA development toolkit
I solved the problem by removing the last 5 packages in this list:
sudo apt remove libcudart10.1 nvidia-cuda-dev nvidia-cuda-doc nvidia-cuda-gdb nvidia-cuda-toolkit
sudo apt autoremove
...
# run cmake and specify the full path to nvcc
cmake .. -DLLAMA_CUBLAS=1 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.2/bin/nvcc
I don't guarantee this is the right thing to do! It just worked for me.
That also worked for me. I had to run the following commands:
$ sudo apt remove libcudart11.0 nvidia-cuda-dev nvidia-cuda-gdb nvidia-cuda-toolkit nvidia-cuda-toolkit-doc
$ sudo apt autoremove
$ make
This issue was closed because it has been inactive for 14 days since being marked as stale.
I ran into this one because the CUDA include folder was wrong. After upgrading to CUDA 12, the include folder has to point at the same version as the compiler.
For cmake, the option related to the include folder is -DCUDAToolkit_ROOT=/usr/local/cuda-12, for example:
cmake .. -DLLAMA_CUDA=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12/bin/nvcc -DCUDAToolkit_ROOT=/usr/local/cuda-12
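The mixed-install situation described in the comments above (a 10.1 toolkit from the distribution next to a 12.2 toolkit from NVIDIA's repository) can be checked for before removing anything. This is only a minimal sketch and not something posted in the thread: `cuda_versions` is a hypothetical helper name, and it assumes a Debian/Ubuntu system where `dpkg -l` prints the `ii name version arch description` layout shown in the listing. It looks only at the runtime, nvcc and distribution toolkit packages and prints each distinct major.minor version it finds.

```shell
# Hypothetical helper (not part of llama.cpp or CUDA): reads `dpkg -l` output
# on stdin and prints the distinct major.minor versions of installed CUDA
# runtime / nvcc / distribution toolkit packages, one per line.
cuda_versions() {
    awk '$1 == "ii" && $2 ~ /cudart|nvcc|nvidia-cuda-toolkit/ {
             split($3, v, ".")      # "12.2.91-1" -> v[1]="12", v[2]="2"
             print v[1] "." v[2]
         }' | sort -u
}

# Usage on a Debian/Ubuntu machine:
#   dpkg -l | cuda_versions
# More than one line of output suggests that several toolkit versions are
# installed side by side, which is the situation described above.
```

If more than one version is listed, it may also be worth comparing `which nvcc` and `nvcc --version` against the path passed to -DCMAKE_CUDA_COMPILER, to see which compiler the build is actually picking up.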