
[Request] llama.cpp-*-git

Open jlo62 opened this issue 1 year ago • 3 comments

Package:

https://aur.archlinux.org/pkgbase/llama.cpp-git

Purpose:

llama.cpp is a port of Facebook's LLaMA model in C/C++ and supports running many large language models (LLMs).

This pkgbase builds the following accelerated llama.cpp packages:

    llama.cpp-git
    llama.cpp-cublas-git
    llama.cpp-clblas-git
    llama.cpp-hipblas-git
    llama.cpp-sycl-f16-git
    llama.cpp-sycl-f32-git
    llama.cpp-vulkan-git

Benefits:

These packages provide GPU-accelerated builds, which can speed up inference severalfold.

Building:

The -git releases are useful for following upstream, as the non-git versions are often outdated and (at least AMD's) frequently lag behind or are unusable.

Copyright:

MIT

Expected Interest:

Many

Already available?

No

Unique request?

Yes

Banned package?

No

More information:

No response

jlo62 avatar Jun 12 '24 16:06 jlo62

Can be added when it compiles successfully: llama.cpp-git.log

Technetium1 avatar Jun 13 '24 18:06 Technetium1

Looks like a library that no other package currently uses.

xiota avatar Jun 13 '24 20:06 xiota

@xiota It runs as an interactive program as well as a server.

The AUR package may need to be updated due to https://github.com/ggerganov/llama.cpp/pull/7809

Technetium1 avatar Jun 13 '24 23:06 Technetium1

I've created a set of packages that you might be interested in including.
These should all build correctly in a chroot since I use pkgctl build to test.

llama.cpp
llama.cpp-vulkan
llama.cpp-sycl-fp16
llama.cpp-sycl-fp32
llama.cpp-opencl
llama.cpp-cuda
llama.cpp-hip

Also, should I be posting this here or on gitlab?

txtsd avatar Oct 27 '24 09:10 txtsd

@txtsd Here is okay. Why did you make each package separate instead of part of the same pkgbuild, like the git package?

xiota avatar Oct 27 '24 15:10 xiota

instead of part of the same pkgbuild, like the git package?

This is the main reason I'd like to see the builds here: building all 7 when I only need one is annoying.

@txtsd :+1:

jlo62 avatar Oct 27 '24 16:10 jlo62

@jlo62 How long do builds take (individual and combined)? How big are the packages? How important is each (rank order)?

xiota avatar Oct 27 '24 16:10 xiota

@xiota

  • Maintenance burden is too high if it's all in one package. If one fails to build, I have to restart the whole build from scratch, without cache (pkgctl build).
  • Dependencies are too large (25GB+). I don't have enough space on my / to install deps for all at once.
  • With pkgctl build, copying deps to the chroot as part of the build process takes way too long.
  • Easier for people who don't use chaotic-aur.

So I split it up into separate packages.

I don't remember how long they take to build, but the -hip and -cuda take the longest.

Package sizes:

λ e llama.cpp*/*.pkg.tar.zst
.rw-r--r-- 4.5M txtsd txtsd 26 Oct 21:05 llama.cpp/llama.cpp-b3982-1-x86_64.pkg.tar.zst
.rw-r--r--  92M txtsd txtsd 27 Oct 04:05 llama.cpp-cuda/llama.cpp-cuda-b3982-2-x86_64.pkg.tar.zst
.rw-r--r--  84M txtsd txtsd 27 Oct 04:06 llama.cpp-cuda/llama.cpp-cuda-debug-b3982-2-x86_64.pkg.tar.zst
.rw-r--r--  84M txtsd txtsd 26 Oct 21:06 llama.cpp/llama.cpp-debug-b3982-1-x86_64.pkg.tar.zst
.rw-r--r--  11M txtsd txtsd 27 Oct 01:19 llama.cpp-hip/llama.cpp-opencl-b3982-2-x86_64.pkg.tar.zst
.rw-r--r-- 4.5M txtsd txtsd 27 Oct 01:34 llama.cpp-opencl/llama.cpp-opencl-b3982-2-x86_64.pkg.tar.zst
.rw-r--r--  43M txtsd txtsd 27 Oct 01:19 llama.cpp-hip/llama.cpp-opencl-debug-b3982-2-x86_64.pkg.tar.zst
.rw-r--r--  84M txtsd txtsd 27 Oct 01:35 llama.cpp-opencl/llama.cpp-opencl-debug-b3982-2-x86_64.pkg.tar.zst
.rw-r--r-- 6.6M txtsd txtsd 26 Oct 23:39 llama.cpp-sycl-f16/llama.cpp-sycl-f16-b3982-1-x86_64.pkg.tar.zst
.rw-r--r--  45M txtsd txtsd 26 Oct 23:40 llama.cpp-sycl-f16/llama.cpp-sycl-f16-debug-b3982-1-x86_64.pkg.tar.zst
.rw-r--r-- 6.6M txtsd txtsd 26 Oct 22:18 llama.cpp-sycl-f32/llama.cpp-sycl-f32-b3982-1-x86_64.pkg.tar.zst
.rw-r--r--  45M txtsd txtsd 26 Oct 22:19 llama.cpp-sycl-f32/llama.cpp-sycl-f32-debug-b3982-1-x86_64.pkg.tar.zst
.rw-r--r-- 5.0M txtsd txtsd 27 Oct 01:38 llama.cpp-vulkan/llama.cpp-vulkan-b3982-1-x86_64.pkg.tar.zst
.rw-r--r--  88M txtsd txtsd 27 Oct 01:39 llama.cpp-vulkan/llama.cpp-vulkan-debug-b3982-1-x86_64.pkg.tar.zst

Build folder sizes:

λ du -h --summarize llama.cpp*
1.4G	llama.cpp
798M	llama.cpp-cuda
446M	llama.cpp-hip
1.4G	llama.cpp-opencl
1.3G	llama.cpp-sycl-f16
1.4G	llama.cpp-sycl-f32
1.5G	llama.cpp-vulkan

txtsd avatar Oct 27 '24 17:10 txtsd

@txtsd Thank you for the info. I'll add one at a time and let you know if I run into any problems.

xiota avatar Oct 27 '24 17:10 xiota

@jlo62 I'm switching this request to the stable version because the maintainer is accessible and responsive.

xiota avatar Oct 27 '24 17:10 xiota

@dr460nf1r3 Problem with . in package name? https://gitlab.com/chaotic-aur/pkgbuilds/-/pipelines/1515413743

xiota avatar Oct 27 '24 17:10 xiota

Package names can contain only alphanumeric characters and any of @, ., _, +, -. Names are not allowed to start with hyphens or dots. All letters should be lowercase.

From: https://wiki.archlinux.org/title/Arch_package_guidelines#Package_naming

Should be accounted for if that's what it is :eyes:
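
That rule translates directly into a bash check. A minimal sketch, assuming the wording quoted above; the function name is made up:

```shell
# Hypothetical validator for the Arch naming rule quoted above:
# lowercase alphanumerics plus @ . _ + -, with no leading hyphen or dot.
is_valid_pkgname() {
  [[ "$1" =~ ^[a-z0-9@_+][a-z0-9@._+-]*$ ]]
}

is_valid_pkgname "llama.cpp-cuda" && echo "valid"    # dots mid-name are fine
is_valid_pkgname ".hidden"        || echo "invalid"  # leading dot is rejected
```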

txtsd avatar Oct 27 '24 17:10 txtsd

@txtsd It's a problem with builder, not packaging. Someone who isn't me (or you) has to look at it.

xiota avatar Oct 27 '24 17:10 xiota

I know. I was just clarifying :smile_cat:

txtsd avatar Oct 27 '24 18:10 txtsd

@dr460nf1r3 Problem with . in package name? https://gitlab.com/chaotic-aur/pkgbuilds/-/pipelines/1515413743

This didn't detect any changes in the folder at all 🤔 The change detection checks every folder for changes, except those starting with a dot.

Edit: you were likely right: https://github.com/chaotic-cx/chaotic-repository-template/blob/main/.ci%2Fon-commit.sh#L85 😳 gotta extend the regex here.

dr460nf1r3 avatar Oct 28 '24 06:10 dr460nf1r3

https://github.com/chaotic-cx/chaotic-repository-template/commit/a1eadf243ca5f29dc5c47b72c234d2cd297cbeab

dr460nf1r3 avatar Oct 28 '24 19:10 dr460nf1r3

Not sure that regex is working. See https://gitlab.com/chaotic-aur/pkgbuilds/-/pipelines/1517191414

Does bash regex recognize \w? This doesn't work in shell.

if [[ "a" =~ ^([\w]+) ]]; then
  echo true
fi

Maybe something like this would work:

local _chars='a-z0-9@_+'
if [[ "$file" =~ ^([${_chars}][${_chars}\.-]*)/ ]]; then
  : do stuff
fi

xiota avatar Oct 28 '24 22:10 xiota

Not sure that regex is working. See https://gitlab.com/chaotic-aur/pkgbuilds/-/pipelines/1517191414

Does bash regex recognize \w? This doesn't work in shell.

if [[ "a" =~ ^([\w]+) ]]; then
  echo true
fi

Maybe something like this would work:

local _chars='a-z0-9@_+'
if [[ "$file" =~ ^([${_chars}][${_chars}\.-]*)/ ]]; then
  : do stuff
fi

I tested it via regex101, and \w is quite a common token… hm. I now wonder what flavor of regex bash uses 🤔 If you want, you can PR the change to the template repo. I'll be working today but can merge stuff. What is the reason for choosing an @ rather than the escaped dot, though?

Edit: indeed, it does not exist. Reference: https://en.m.wikibooks.org/wiki/Regular_Expressions/POSIX-Extended_Regular_Expressions
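
The difference can be demonstrated directly in bash, whose `[[ =~ ]]` compiles its pattern as a POSIX ERE; the character class below mirrors the suggestion above:

```shell
# POSIX ERE has no \w shorthand; inside brackets, [\w] just matches a
# literal '\' or 'w', so the original test never succeeds.
if [[ "abc" =~ ^([\w]+) ]]; then
  echo "matched"             # not reached: 'a' is neither '\' nor 'w'
else
  echo "no \w in POSIX ERE"
fi

# An explicit character class works as expected:
chars='a-z0-9@_+'
file='llama.cpp-cuda/PKGBUILD'
if [[ "$file" =~ ^([${chars}][${chars}.-]*)/ ]]; then
  echo "pkgbase: ${BASH_REMATCH[1]}"   # prints: pkgbase: llama.cpp-cuda
fi
```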

dr460nf1r3 avatar Oct 29 '24 06:10 dr460nf1r3

Is there a GitLab instance, or should I open a PR at github.com/chaotic-cx/chaotic-repository-template?

xiota avatar Oct 29 '24 17:10 xiota

It's the GitHub repo only. Though I already amended my commit.

dr460nf1r3 avatar Oct 30 '24 05:10 dr460nf1r3

Added an interfere (571518779fa3ad6fca467ec3cba0e1e3fe6a9686) to use libggml-git. It's needed to avoid a conflict with whisper.cpp. I'm using a custom package because the AUR package has some problems, like an illegal-instruction error.

There's no point adding the other variants, because the difference seems to come from how they build libggml.

@txtsd llama.cpp, built from the AUR, produces illegal-instruction errors because the default libggml options indirectly use -march=native (by detecting CPU features and enabling them individually). See the custom libggml-git for options to build for baseline x86_64.
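
For reference, a baseline build roughly amounts to disabling native-CPU detection at configure time. This is a sketch, not the actual PKGBUILD, and the option name (GGML_NATIVE here) may differ between llama.cpp/ggml versions:

```shell
# Configure llama.cpp for generic x86_64 rather than the build host's CPU.
# GGML_NATIVE=OFF stops ggml from probing CPU features and enabling them
# individually (the behavior that effectively implies -march=native).
cmake -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_NATIVE=OFF
cmake --build build
```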

xiota avatar Nov 06 '24 22:11 xiota

@xiota Just to be sure, you're asking me to:

  1. Build llama.cpp* in a generic manner by using the build options from libggml-git
  2. Remove the built libggml and use libggml-git as a dependency to avoid a conflict with whisper.cpp

Is that correct?

txtsd avatar Nov 07 '24 00:11 txtsd

@txtsd

The interfere fixes the issues with llama.cpp as far as the chaotic-aur repo is concerned.

  1. It would be best practice to make llama.cpp work with baseline x86_64, using the build options from the custom libggml-git package in this repo.

  2. Whether to remove the libggml files after the build is up to you. I personally would not unless a user asks. However, since the variants seem to work by changing libggml build parameters, maybe there should be only one version of llama.cpp with multiple versions of libggml. (The same may apply to whisper.cpp.)

You could also try linking libggml statically so that there are no conflicting files in the final package. This might require me to rewrite the interfere.

xiota avatar Nov 07 '24 01:11 xiota

@xiota I made the llama.cpp* packages use baseline x86_64, using the custom libggml-git package as a reference. Please see if it's sufficient, and let me know.

I'm not yet sure how to link libggml statically. I tried toggling the GGML_STATIC flag, but that just builds and bundles a libggml.a. I'm not sure if that's correct.

txtsd avatar Nov 09 '24 06:11 txtsd

@txtsd I don't have hardware to test all the configurations, but the changes look okay to me. `llama.cpp` built successfully.

I haven't tested with these specific packages, but usually, after static linking, .a, headers, and other related files can be deleted.
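
In PKGBUILD terms, that cleanup is just a couple of rm calls at the end of package(). A minimal sketch, simulated here against a temporary directory standing in for $pkgdir; all paths are illustrative:

```shell
# After a static link, the .a archive and the headers serve no runtime
# purpose, so they can be dropped from the package tree before packaging.
pkgdir=$(mktemp -d)   # stand-in for makepkg's $pkgdir
mkdir -p "$pkgdir/usr/lib" "$pkgdir/usr/include/ggml" "$pkgdir/usr/bin"
touch "$pkgdir/usr/lib/libggml.a" \
      "$pkgdir/usr/include/ggml/ggml.h" \
      "$pkgdir/usr/bin/llama-cli"

rm -f  "$pkgdir"/usr/lib/*.a      # static archives: build-time only
rm -rf "$pkgdir/usr/include"      # headers: build-time only

ls "$pkgdir/usr/bin"              # the binaries themselves remain
```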

xiota avatar Nov 09 '24 15:11 xiota

I will investigate further, thanks!

txtsd avatar Nov 10 '24 08:11 txtsd

@txtsd I took a look at whisper.cpp (related to the libggml dep), and I think static builds would probably be best. If a static build cannot be done, you could install to an alternate location (/usr/lib/llama.cpp) and set the rpath.

If you want help, you can add me as a co-maintainer of llama.cpp. Once that one is working, it should be usable as a model for the others.

xiota avatar Nov 10 '24 20:11 xiota

@xiota Thanks for your offer of help. I made all the packages build statically.

You were right; removing the .a files was the way to go. I was under the impression makepkg handled that internally via the default !staticlibs option.

I tested with a model and prompt to make sure llama.cpp worked before I ported the changes to the other packages.

txtsd avatar Nov 11 '24 09:11 txtsd

We might have a big problem with statically built libggml in llama.cpp (at least for CUDA).

See: https://aur.archlinux.org/packages/llama.cpp-cuda#comment-998638

txtsd avatar Nov 15 '24 10:11 txtsd

@txtsd In that case... try building with a shared library, but install it to /usr/lib/llama-cpp. Use patchelf to set the rpath, and symlink the binaries.
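
A sketch of that layout, simulated against a temporary directory standing in for $pkgdir; the binary name and paths are hypothetical:

```shell
# Ship libggml.so in a private directory so it cannot clash with
# whisper.cpp's copy, then expose the binaries through /usr/bin symlinks.
pkgdir=$(mktemp -d)   # stand-in for makepkg's $pkgdir
mkdir -p "$pkgdir/usr/lib/llama-cpp" "$pkgdir/usr/bin"
touch "$pkgdir/usr/lib/llama-cpp/libggml.so" \
      "$pkgdir/usr/lib/llama-cpp/llama-cli"

# In the real build, the binary's library search path would be fixed with:
#   patchelf --set-rpath /usr/lib/llama-cpp "$pkgdir/usr/lib/llama-cpp/llama-cli"

ln -s /usr/lib/llama-cpp/llama-cli "$pkgdir/usr/bin/llama-cli"
```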

xiota avatar Nov 15 '24 16:11 xiota