whisper.cpp
Link errors using Golang bindings with WHISPER_CUBLAS=1
I've been working on a Go service that runs inside a Docker container. The basic setup works: the whisper.cpp library builds and integrates into my application without problems.
To add CUDA support, I changed my make command to WHISPER_CUBLAS=1 make whisper -j. The whisper.cpp library still builds successfully, but building the Go application that uses it now fails with a long list of linker errors about undefined CUDA references.
I've tried adding extra library paths to CGO_LDFLAGS (and checked that they are correct for the NVIDIA container), but nothing changed. Can linking against the CUDA libraries work with CGO at all? Do I need to change the way the whisper.cpp library is linked?
Here is my current Dockerfile:
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04 AS build
RUN apt-get update
RUN apt-get install -y wget build-essential libprotobuf-dev protobuf-compiler protobuf-compiler-grpc golang-google-protobuf-dev grpc-proto
# Install Go tools
WORKDIR /tmp
RUN wget https://go.dev/dl/go1.21.4.linux-amd64.tar.gz
RUN tar -xzf go1.21.4.linux-amd64.tar.gz
ARG PATH=$PATH:/tmp/go/bin
ARG GOPATH=/tmp/go
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
RUN go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
WORKDIR /build
COPY . .
# Build the protobuf Go files
WORKDIR /build/proto
RUN protoc -I. -I/usr/include/google/protobuf --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative whisper.proto
# Build the whisper.cpp library
WORKDIR /build/whispercpp/bindings/go
RUN WHISPER_CUBLAS=1 make whisper -j
# Build the server application
WORKDIR /build
ARG CGO_ENABLED=1
ARG CGO_CFLAGS=-I/build/whispercpp -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
ARG CGO_CXXFLAGS=-I/build/whispercpp -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
ARG CGO_LDFLAGS=-L/build/whispercpp -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib -L/usr/local/nvidia/lib -L/usr/local/nvidia/lib64 -L/usr/local/cuda/extras/CUPTI/lib64
RUN go build -o wspr_rpcsvc
# Prepare a fresh, minimal image without source files and with CUDA support
FROM nvidia/cuda:11.8.0-base-ubuntu22.04 AS runner
# Install the NVIDIA Container Toolkit (CUDA via Docker support)
WORKDIR /
COPY --from=build /build/resources /resources
COPY --from=build /build/wspr_rpcsvc /wspr_rpcsvc
CMD ["/wspr_rpcsvc"]
Linker Errors
...
6.805 /usr/bin/ld: tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text._Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PKcPKfS4_PfllllRKP11CUstream_st[_Z27ggml_cuda_op_mul_mat_cublasPK11ggml_tensorS1_PS_PKcPKfS4_PfllllRKP11CUstream_st]+0x8c0): undefined reference to `cudaGetDevice'
6.805 /usr/bin/ld: /build/whispercpp/libwhisper.a(ggml-cuda.o): in function `__sti____cudaRegisterAll()':
6.805 tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0xd): undefined reference to `__cudaRegisterFatBinary'
6.805 /usr/bin/ld: tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0x41): undefined reference to `__cudaRegisterFunction'
6.805 /usr/bin/ld: tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0x6f): undefined reference to `__cudaRegisterFunction'
6.805 /usr/bin/ld: tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0x9d): undefined reference to `__cudaRegisterFunction'
6.805 /usr/bin/ld: tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0xcb): undefined reference to `__cudaRegisterFunction'
6.805 /usr/bin/ld: tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0xf9): undefined reference to `__cudaRegisterFunction'
6.805 /usr/bin/ld: /build/whispercpp/libwhisper.a(ggml-cuda.o):tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0x127): more undefined references to `__cudaRegisterFunction' follow
6.805 /usr/bin/ld: /build/whispercpp/libwhisper.a(ggml-cuda.o): in function `__sti____cudaRegisterAll()':
6.805 tmpxft_00000032_00000000-6_ggml-cuda.compute_90.cudafe1.cpp:(.text.startup+0x12d3): undefined reference to `__cudaRegisterFatBinaryEnd'
6.805 collect2: error: ld returned 1 exit status
6.805
------
Dockerfile:32
--------------------
30 | ARG CGO_CXXFLAGS=-I/build/whispercpp -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
31 | ARG CGO_LDFLAGS=-L/build/whispercpp -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib -L/usr/local/nvidia/lib -L/usr/local/nvidia/lib64 -L/usr/local/cuda/extras/CUPTI/lib64
32 | >>> RUN go build -o wspr_rpcsvc
33 |
34 | # Prepare a fresh, minimal image without source files and with CUDA support
--------------------
ERROR: failed to solve: process "/bin/sh -c go build -o wspr_rpcsvc" did not complete successfully: exit code: 1
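A log like the one above repeats the same few symbols many times, so it can help to reduce it to the unique set of missing names before looking for the library that should provide them. The following is only a minimal sketch: the function name UndefinedSymbols and the idea of piping the build log through it are illustrative, not part of whisper.cpp or its Go bindings, and it assumes GNU ld's "undefined reference to" message format as shown in the log.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"regexp"
	"sort"
)

// undefinedRef matches GNU ld messages such as
//   undefined reference to `cudaGetDevice'
//   more undefined references to `__cudaRegisterFunction' follow
// \s+ tolerates a line break between "to" and the symbol name.
var undefinedRef = regexp.MustCompile("undefined references? to\\s+`([^']+)'")

// UndefinedSymbols returns the sorted, de-duplicated symbol names that the
// given linker output reports as undefined.
func UndefinedSymbols(log string) []string {
	seen := map[string]bool{}
	for _, m := range undefinedRef.FindAllStringSubmatch(log, -1) {
		seen[m[1]] = true
	}
	syms := make([]string, 0, len(seen))
	for s := range seen {
		syms = append(syms, s)
	}
	sort.Strings(syms)
	return syms
}

func main() {
	// Read the build log from standard input and print each missing symbol once.
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		return
	}
	for _, s := range UndefinedSymbols(string(data)) {
		fmt.Println(s)
	}
}
```

Saved under a hypothetical name such as undefsyms.go, it could be used as `go build -o wspr_rpcsvc 2>&1 | go run undefsyms.go`. Names beginning with cuda or __cuda belong to the CUDA runtime API, which is a hint about which library the link step is failing to resolve.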
The same kind of issue occurs with OpenBLAS, too:
# Build the whisper.cpp library
WORKDIR /build/whispercpp/bindings/go
RUN WHISPER_OPENBLAS=1 make whisper -j
# Build the server application
WORKDIR /build
ARG CGO_ENABLED=1
ARG CGO_CFLAGS=-I/build/whispercpp -I/usr/local/include/openblas -I/usr/include/openblas
ARG CGO_CXXFLAGS=-I/build/whispercpp -I/usr/local/include/openblas -I/usr/include/openblas
ARG CGO_LDFLAGS=-L/build/whispercpp -lopenblas
RUN go build -o wspr_rpcsvc
results in:
6.325 # poll_busily/whispernet/server
6.325 /tmp/go/pkg/tool/linux_amd64/link: running gcc failed: exit status 1
6.325 /usr/bin/ld: /build/whispercpp/libwhisper.a(ggml.o): in function `ggml_compute_forward_mul_mat':
6.325 ggml.c:(.text+0x11e63): undefined reference to `cblas_sgemm'
6.325 collect2: error: ld returned 1 exit status
In case this affects anyone else, I found a workaround, which is to build the shared library (libwhisper.so) instead of the static libwhisper.a:
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04 AS build
RUN apt-get update
RUN apt-get install -y wget build-essential libprotobuf-dev protobuf-compiler protobuf-compiler-grpc golang-google-protobuf-dev grpc-proto
# Install Go tools
WORKDIR /tmp
RUN wget https://go.dev/dl/go1.21.4.linux-amd64.tar.gz
RUN tar -xzf go1.21.4.linux-amd64.tar.gz
ARG PATH=$PATH:/tmp/go/bin
ARG GOPATH=/tmp/go
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
RUN go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
WORKDIR /build
COPY . .
# Build the protobuf Go files
WORKDIR /build/proto
RUN protoc -I. -I/usr/include/google/protobuf --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative whisper.proto
# Build whisper shared library
WORKDIR /build/whispercpp
RUN WHISPER_CUBLAS=1 make libwhisper.so -j
# Build the server application
WORKDIR /build
ARG CGO_ENABLED=1
ARG CGO_CFLAGS=-I/build/whispercpp -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
ARG CGO_CXXFLAGS=-I/build/whispercpp -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include
ARG CGO_LDFLAGS=-L/build/whispercpp -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/x86_64-linux/lib -L/usr/local/nvidia/lib -L/usr/local/nvidia/lib64
RUN go build -o wspr_rpcsvc
# Prepare a fresh, minimal image without source files and with CUDA support
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04 AS runner
WORKDIR /
COPY --from=build /build/resources /resources
COPY --from=build /build/wspr_rpcsvc /wspr_rpcsvc
COPY --from=build /build/whispercpp/libwhisper.so /usr/lib/x86_64-linux-gnu/libwhisper.so
RUN ldconfig
CMD ["/wspr_rpcsvc"]
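One way to confirm that a binary built this way really depends on the shared library is to inspect its dynamic dependencies, which is roughly what ldd or readelf -d would report. The sketch below uses Go's standard debug/elf package. The helper name needsLibrary and the program itself are illustrative and not part of whisper.cpp or its bindings, and the explanation that the CUDA libraries are then resolved through libwhisper.so's own dependencies is an assumption about why the workaround helps, not something confirmed in this thread.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// needsLibrary reports whether any dynamic dependency name starts with name,
// so "libwhisper.so" also matches a versioned "libwhisper.so.1".
func needsLibrary(needed []string, name string) bool {
	for _, lib := range needed {
		if strings.HasPrefix(lib, name) {
			return true
		}
	}
	return false
}

// importedLibraries returns the DT_NEEDED entries of the ELF file at path.
func importedLibraries(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: checkdeps <elf-binary>")
		return
	}
	needed, err := importedLibraries(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	for _, lib := range needed {
		fmt.Println(lib)
	}
	if !needsLibrary(needed, "libwhisper.so") {
		fmt.Fprintln(os.Stderr, "libwhisper.so is not listed as a dynamic dependency")
	}
}
```

Running it against /wspr_rpcsvc in the build stage lists the libraries the loader will look for at startup, which is also why the runner stage above copies libwhisper.so and runs ldconfig.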
For reference, I had the same issue and was not able to build libwhisper.so either. I solved it by adding -L$(CUDA_PATH)/stubs -lcuda to the whisper.cpp LDFLAGS and to my CGO_LDFLAGS when building the Go binary against libwhisper.a (not the .so).
I traced it back by searching for the symbols the linker complains about:
nm -D /usr/local/cuda/targets/x86_64-linux/lib/*.so | grep cuMem
<no results>
They are present in the stubs directory instead:
root@76d08ab315dc:/build# nm -D /usr/local/cuda/targets/x86_64-linux/lib/stubs/*.so | grep cuMem
0000000000008000 T cuMemAddressFree
0000000000007ff0 T cuMemAddressReserve
00000000000081d0 T cuMemAdvise
0000000000008f30 T cuMemAlloc
0000000000009710 T cuMemAllocAsync
00000000000080e0 T cuMemAllocAsync_ptsz
0000000000009720 T cuMemAllocFromPoolAsync
0000000000008160 T cuMemAllocFromPoolAsync_ptsz
0000000000008f70 T cuMemAllocHost
0000000000007bf0 T cuMemAllocHost_v2
0000000000007c40 T cuMemAllocManaged
0000000000008f40 T cuMemAllocPitch
... (and many others)
See also: https://github.com/NVIDIA/nvidia-docker/issues/508
I am building against whisper.cpp in my Go program on a CI worker that doesn't have a graphics card, so my CUDA toolkit installation relies on the stub libraries.
After banging my head on this for a while (and not being able to get libwhisper.a to link properly), I was able to link against libwhisper.so by adding the following to my CGO_LDFLAGS:
-Wl,-rpath-link,/usr/local/cuda/lib64/stubs -L/usr/local/cuda/lib64
(I also had to change whisper.cpp's Makefile to link against the stubs, as indicated in #1973.)
Hopefully this is helpful to someone else.