ci: add linux binaries to release build
Adds Ubuntu 20.04 binaries to the releases, and also cuBLAS Linux builds.
I changed the path where the dynamic library is put. It was in the CMake build directory before; now it's next to the executables (down in bin/).
I always build shared, with a relative RPATH (so no `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:.` needed for libllama.so).
Distributing the lib makes life easier for wrappers (e.g. Python libs).
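Roughly, the configure step looks something like this; a minimal sketch using standard CMake options, not the PR's literal CI commands:

```sh
# Build shared, with a relative RPATH: $ORIGIN expands at load time to
# the directory containing the running binary, so executables find
# libllama.so sitting next to them without LD_LIBRARY_PATH.
cmake -B build \
    -DBUILD_SHARED_LIBS=ON \
    -DCMAKE_BUILD_RPATH='$ORIGIN' \
    -DCMAKE_INSTALL_RPATH='$ORIGIN'
cmake --build build --config Release

# Verify the RPATH/RUNPATH entry the dynamic loader will use:
readelf -d build/bin/main | grep -iE 'r(un)?path'
```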
not in release yet:
- avx512 (I can't test this)
- openblas (system lib, maybe ship it? see the `ldd` check below)
- clblast (system lib, maybe ship it?)
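One way to answer the "maybe ship?" question is to inspect what the binaries actually link against; anything beyond stock distro libraries would have to travel with the release:

```sh
# List the shared libraries the binary expects at runtime; libopenblas
# or libclblast showing up here means users must have them installed,
# or we need to bundle them next to the executable.
ldd build/bin/main
```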
example release: https://github.com/Green-Sky/llama.cpp/releases/tag/ci_cublas_linux-99b7d15
I think we should just add GNUInstallDirs in the CMakeLists.txt; that way distributors can configure the paths where the files get installed. `cmake --install` will also do stripping and RPATH configuration.
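A sketch of what that flow gives distributors; the prefix and staging paths here are illustrative:

```sh
# With include(GNUInstallDirs) plus install() rules in CMakeLists.txt,
# the install locations become configurable at configure/install time.
cmake -B build -DCMAKE_INSTALL_PREFIX=/usr
cmake --build build --config Release

# --strip strips the installed binaries; CMake also rewrites RPATHs to
# the install-time values (CMAKE_INSTALL_RPATH) during this step.
cmake --install build --prefix /tmp/llama-stage --strip
```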
> I think we should just add GNUInstallDirs in the CMakeLists.txt; that way distributors can configure the paths where the files get installed. `cmake --install` will also do stripping and RPATH configuration.
will look into that tomorrow
Update: the CUDA toolkit install now nukes the GitHub Actions runners; it uses too much disk space.
Maybe we can keep just one CUDA version?
> Maybe we can keep just one CUDA version?
I think one runner does one job at a time, so I don't think that would make a difference. Going to play around with selective installs again <.<
Also, I once saw a GH workflow where some non-essentials were deleted to make some space...
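That pattern usually looks like the step below; the paths are the usual space hogs on the hosted ubuntu runner image, but worth re-checking before relying on them:

```sh
# Delete preinstalled toolchains we don't need; together these free
# tens of GB on a stock GitHub-hosted ubuntu runner.
sudo rm -rf /usr/share/dotnet        # .NET SDKs
sudo rm -rf /usr/local/lib/android   # Android SDK/NDK
sudo rm -rf /opt/ghc                 # Haskell toolchains
df -h /                              # see how much was reclaimed
```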
OK, I managed to download everything and even run main, but I get this error:
CUDA error 222 at /home/runner/work/llama.cpp/llama.cpp/ggml-cuda.cu:1244: the provided PTX was compiled with an unsupported toolchain.
This is with release fafc8ae and the CUDA 12 version. The machine also has CUDA 12 and a 2080 Ti.
@SlyEcho btw, I switched to the "networked" installer, which just sets up an apt repo... but that works for us.
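For reference, that route is essentially the following; the keyring URL matches NVIDIA's Ubuntu 20.04 repo, but the exact package names and versions are assumptions to adjust for the targeted CUDA release:

```sh
# Register NVIDIA's apt repo via the cuda-keyring package, then install
# only what's needed to compile, instead of the multi-GB full toolkit.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda-nvcc-11-7 libcublas-dev-11-7
```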
> OK, I managed to download everything and even run main, but I get this error:
>
> CUDA error 222 at /home/runner/work/llama.cpp/llama.cpp/ggml-cuda.cu:1244: the provided PTX was compiled with an unsupported toolchain.
>
> This is with release fafc8ae and the CUDA 12 version. The machine also has CUDA 12 and a 2080 Ti.
This looks very weird, no idea what is happening here. Since I don't use `nvprune`, I thought it would just work. I can run the CUDA 11.7 build just fine on my system. My driver is too old for 12...
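For what it's worth, error 222 typically means the driver's JIT is older than the toolkit that produced the embedded PTX; baking in real SASS for the target GPUs sidesteps the JIT entirely. A sketch, assuming the build goes through CMake's native CUDA support (`LLAMA_CUBLAS` is the flag llama.cpp used at the time; the architecture list is an assumption):

```sh
# Compile SASS for common consumer architectures (Pascal .. Ampere) so
# the driver never has to JIT-compile PTX from a newer toolchain.
cmake -B build \
    -DLLAMA_CUBLAS=ON \
    -DCMAKE_CUDA_ARCHITECTURES="60;70;75;80"
cmake --build build --config Release
```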
Isn’t there an image with CUDA already installed?
I plan to go with that approach for ROCm.
Image, hmm. Installing CUDA now only takes about as long as the compile (~1 min), so I don't really see the point of using Docker (I'm assuming that's what you mean by image).
https://github.com/Jimver/cuda-toolkit/issues/249
This made installing less than the full toolkit viable (without me manually setting up the apt sources :smile:).
Yeah, I meant Docker. AMD publishes their images with everything installed already, although I don't know if it's possible to redistribute some of those runtime components.
It would be cool for the Windows builds, those take forever. But for the Linux builds it's now <50% of total build time.
There is still the problem of distributing the binaries NOT every release; those uploads now take up a significant amount of time (comparatively).