[ci] add LLAMA_CURL flags to the prebuilt binaries

Open Vaibhavs10 opened this issue 1 year ago • 8 comments

A non-negligible number of projects are built directly on top of the prebuilt binaries in the llama.cpp releases.

These binaries currently do not come with cURL support, so anyone using them has to add their own logic to download and cache model checkpoints (for example: https://github.com/pinokiocomputer/llamanet/issues/1). Given that llama.cpp already supports this, shipping with the LLAMA_CURL flag ON by default would lead to a better overall developer experience. (This would match what we already do for the Mac prebuilt binaries.)
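To illustrate the kind of download-and-cache logic a consumer of the binaries currently has to write themselves, here is a minimal Python sketch. It is not what llama.cpp or llamanet actually does; the function name `cached_download`, the cache layout, and the hash-based file naming are all assumptions made for the example.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def cached_download(url: str, cache_dir: Path) -> Path:
    """Download `url` into `cache_dir` once; later calls reuse the cached file.

    Hypothetical helper: the cache key (URL hash + file name) is an
    assumption for this sketch, not llama.cpp's own scheme.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    name = url.rsplit("/", 1)[-1] or "model"
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    target = cache_dir / f"{key}-{name}"
    if target.exists():
        return target  # cache hit: no network access needed

    # Write to a temporary file first so an interrupted download
    # never leaves a truncated file under the final name.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

A caller would pass the checkpoint URL and a cache directory, then hand the returned path to the prebuilt binary. Resuming partial downloads, checksum verification, and authentication are left out of this sketch.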

Note: I've currently added the flag to all build statements; feel free to point out any that don't make sense.

P.S. I wasn't able to test this build in CI before making the PR.

Vaibhavs10 avatar Jun 04 '24 16:06 Vaibhavs10

I think there is a problem with that: this would link libcurl dynamically, so if it is not installed on the target system, the binaries won't work.
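As a rough way to see whether a given target system would be affected, the following Python sketch checks whether a libcurl shared library can be located and loaded. The helper name `can_load` is hypothetical, and `ctypes.util.find_library` only approximates what the dynamic loader does when a binary starts, so treat the result as a hint rather than a guarantee.

```python
import ctypes
import ctypes.util


def can_load(name: str) -> bool:
    """Return True if a shared library called `name` can be found and loaded.

    Hypothetical helper for illustration; lookup rules differ per platform.
    """
    found = ctypes.util.find_library(name)
    if found is None:
        return False
    try:
        ctypes.CDLL(found)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("libcurl loadable:", can_load("curl"))
```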

ggerganov avatar Jun 04 '24 18:06 ggerganov

Good point! I think Windows has shipped cURL since Windows 10. The tricky bit would be Ubuntu.

In general, my understanding is that people who consume the binaries would already have cURL installed (it is a pretty basic utility, IMO).

But maybe we target the most common builds first, i.e. Windows + Linux + Mac (and any others you suggest), switch for maybe a week, and revert if a lot of issues come in about it? (I can help monitor the issues.)

WDYT?

Vaibhavs10 avatar Jun 04 '24 20:06 Vaibhavs10

@ggerganov I just discussed this with my colleagues and realised that you were talking about the library libcurl and not curl the executable. Is there a reason why we link dynamically rather than statically in this case? I do think having an easy way to download checkpoints across the prebuilt binaries (or at least a big chunk of them) would be good DX.

Perhaps we build these with LLAMA_STATIC? Would that work? (This would increase the size of the binaries considerably, though.)

Vaibhavs10 avatar Jun 05 '24 07:06 Vaibhavs10

I think it might be possible, but someone has to work out the tradeoffs of linking libcurl statically (in terms of size, portability, etc.). I just don't know how much baggage this dependency brings.

For example, here is a comparison of ldd ./bin/main with and without libcurl:

# LLAMA_CURL=OFF
ldd ./bin/main 
	linux-vdso.so.1 (0x00007ffdc8ff0000)
	libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007b6bb0b1c000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007b6bb0400000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007b6bb0719000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007b6bb0afc000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007b6bb0000000)
	/lib64/ld-linux-x86-64.so.2 (0x00007b6bb0b7e000)

# LLAMA_CURL=ON
ldd ./bin/main 
	linux-vdso.so.1 (0x00007ffe5ab76000)
	libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f7a0f93d000)
	libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f7a0f559000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7a0f200000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7a0f472000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7a0f91d000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7a0ee00000)
	libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f7a0f448000)
	libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007f7a0f1df000)
	librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007f7a0f1c0000)
	libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x00007f7a0f153000)
	libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007f7a0f434000)
	libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007f7a0f0af000)
	libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007f7a0e800000)
	libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f7a0f05b000)
	libldap-2.5.so.0 => /lib/x86_64-linux-gnu/libldap-2.5.so.0 (0x00007f7a0eda1000)
	liblber-2.5.so.0 => /lib/x86_64-linux-gnu/liblber-2.5.so.0 (0x00007f7a0f04a000)
	libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f7a0ecd2000)
	libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007f7a0f03c000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f7a0ecb6000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7a0f99f000)
	libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007f7a0e656000)
	libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f7a0e46b000)
	libhogweed.so.6 => /lib/x86_64-linux-gnu/libhogweed.so.6 (0x00007f7a0ec6e000)
	libnettle.so.8 => /lib/x86_64-linux-gnu/libnettle.so.8 (0x00007f7a0e425000)
	libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f7a0e3a3000)
	libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f7a0e2d8000)
	libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f7a0e2a9000)
	libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f7a0f90f000)
	libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f7a0f02e000)
	libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f7a0ec53000)
	libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007f7a0e286000)
	libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f7a0e14b000)
	libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007f7a0e133000)
	libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f7a0f42d000)
	libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f7a0e11f000)
	libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x00007f7a0ec46000)
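To make the comparison above easier to read, here is a small Python sketch that parses two `ldd` outputs and reports which shared objects appear only in the second. The function name `parse_ldd` and the sample strings (abbreviated excerpts of the two listings above) are choices made for this example. Applied to the full listings, the difference is 30 entries: libcurl itself plus 29 libraries it pulls in.

```python
from typing import Set


def parse_ldd(output: str) -> Set[str]:
    """Collect the library name (first token) from each dependency line."""
    names = set()
    for line in output.splitlines():
        stripped = line.strip()
        # Dependency lines either contain "=>" or end with a load address.
        if "=>" in stripped or stripped.endswith(")"):
            names.add(stripped.split()[0])
    return names


# Abbreviated excerpts of the listings above, for illustration only.
WITHOUT_CURL = """
    linux-vdso.so.1 (0x00007ffdc8ff0000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007b6bb0400000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007b6bb0000000)
"""

WITH_CURL = """
    linux-vdso.so.1 (0x00007ffe5ab76000)
    libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f7a0f559000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7a0f200000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7a0ee00000)
    libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f7a0f448000)
    libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007f7a0f0af000)
"""

if __name__ == "__main__":
    added = parse_ldd(WITH_CURL) - parse_ldd(WITHOUT_CURL)
    for name in sorted(added):
        print(name)
```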

ggerganov avatar Jun 05 '24 07:06 ggerganov

@Vaibhavs10 @ggerganov one idea may be to include both the original version and the libcurl version, like:

  • llama-b3091-bin-win-cuda-cu12.2.0-x64.zip
  • llama-b3091-bin-win-cuda-cu12.2.0-x64-libcurl.zip

cocktailpeanut avatar Jun 05 '24 13:06 cocktailpeanut

one idea maybe to include both the original version and the libcurl version, like:

I like this idea; however, it would double the number of binaries we ship. WDYT, @ggerganov?

Btw, statically linking it is not a good idea, primarily because the end user should be aware of what they are running and what they are running it with.

To limit the scope, maybe we only add the curl binding for Windows (CPU + CUDA) and Linux (CPU + CUDA) at first (since Mac already has it), and then see where to go next?

Vaibhavs10 avatar Jun 06 '24 09:06 Vaibhavs10

Not really sure. We probably have to figure out how to statically link libcurl, and then maybe extend the release workflow to compile a minimal version of libcurl from source that has only the features used by llama.cpp, and link that into the binaries. I think this would be the most efficient and portable way.

ggerganov avatar Jun 07 '24 12:06 ggerganov

Related to this, I really want to build portable binaries for all flavors of Linux (CentOS, Ubuntu, etc.). Do the statically built binaries work well for these use cases?

ericcurtin avatar Jul 25 '24 00:07 ericcurtin

@Vaibhavs10 @ggerganov @vignesh1507 did we ever solve the problem of shipping libcurl on Windows in the end?

ericcurtin avatar Sep 19 '25 11:09 ericcurtin