[ci] add LLAMA_CURL flags to the prebuilt binaries
At least a non-negligible number of projects are directly built on top of the prebuilt binaries in the llama.cpp releases.
These binaries currently do not come with support for CURL. Hence, someone using the binaries would need to add their own logic to download and cache model checkpoints (for example: https://github.com/pinokiocomputer/llamanet/issues/1). Given we already support this in llama.cpp, it'd lead to a better overall developer experience if we ship with the LLAMA_CURL flag ON by default. (This behavior will be similar to what we already do for the Mac prebuilt binaries)
Note: I've currently added the flag to all build statements, feel free to point to those that don't make sense.
P.S. I wasn't able to test this build on a CI before making the PR.
I think there is a problem with that because this would link libcurl dynamically, so if it is not installed on the target system, then the binaries won't work
Good point! I think for Windows from Windows 10 onward they already ship cURL. The tricky bit would be ubuntu.
In general, from my understanding people who consume the binaries would have cURL already installed (It is a pretty basic utility IMO).
But, maybe we target the most common builds first i.e. Windows + Linux + Mac (and any other you suggest) - switch for maybe a week and if there's a lot of issues about the same we revert? (I can help monitoring the issues)
WDYT?
@ggerganov I Just discussed this with my colleagues and realised that you were talking about the library libcurl and not curl the executable. Is there a reason why we dynamically compile vs static in this case? I do think having an easy way to download checkpoints across prebuilt binaries (or at least a big chunk of them would be good DX)
Perhaps we build these with LLAMA_STATIC? would that work? (This would increase the size of the binaries considerably tho)
I think it might be possible, but someone has to see what is the tradeoff to link libcurl statically (in terms of size, portability, etc). I just don't know how big is the baggage from this dependency.
For example, here is a comparison of ldd ./bin/main with and without libcurl:
# LLAMA_CURL=OFF
ldd ./bin/main
linux-vdso.so.1 (0x00007ffdc8ff0000)
libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007b6bb0b1c000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007b6bb0400000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007b6bb0719000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007b6bb0afc000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007b6bb0000000)
/lib64/ld-linux-x86-64.so.2 (0x00007b6bb0b7e000)
# LLAMA_CURL=ON
ldd ./bin/main
linux-vdso.so.1 (0x00007ffe5ab76000)
libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f7a0f93d000)
libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f7a0f559000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7a0f200000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7a0f472000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7a0f91d000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7a0ee00000)
libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f7a0f448000)
libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007f7a0f1df000)
librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007f7a0f1c0000)
libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x00007f7a0f153000)
libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007f7a0f434000)
libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007f7a0f0af000)
libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007f7a0e800000)
libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f7a0f05b000)
libldap-2.5.so.0 => /lib/x86_64-linux-gnu/libldap-2.5.so.0 (0x00007f7a0eda1000)
liblber-2.5.so.0 => /lib/x86_64-linux-gnu/liblber-2.5.so.0 (0x00007f7a0f04a000)
libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f7a0ecd2000)
libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007f7a0f03c000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f7a0ecb6000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7a0f99f000)
libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007f7a0e656000)
libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f7a0e46b000)
libhogweed.so.6 => /lib/x86_64-linux-gnu/libhogweed.so.6 (0x00007f7a0ec6e000)
libnettle.so.8 => /lib/x86_64-linux-gnu/libnettle.so.8 (0x00007f7a0e425000)
libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f7a0e3a3000)
libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f7a0e2d8000)
libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f7a0e2a9000)
libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f7a0f90f000)
libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f7a0f02e000)
libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f7a0ec53000)
libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007f7a0e286000)
libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f7a0e14b000)
libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007f7a0e133000)
libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f7a0f42d000)
libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f7a0e11f000)
libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x00007f7a0ec46000)
@Vaibhavs10 @ggerganov one idea maybe to include both the original version and the libcurl version, like:
- llama-b3091-bin-win-cuda-cu12.2.0-x64.zip
- llama-b3091-bin-win-cuda-cu12.2.0-x64-libcurl.zip
one idea maybe to include both the original version and the libcurl version, like:
I like this idea; however, it will require twice the number of binaries we ship. WDYT, @ggerganov?
btw, statically linking it is not a good idea primarily because the end-user should be aware of what they are running and what they are running it.
To limit the scope at first maybe we only add curl binding for windows (cpu + cuda) + linux (cpu + cuda) (since Mac already has it) at first and then see where to go next?
Not really sure. Probably we have to figure out how to statically link libcurl and then maybe extend the release workflow to compile a minimal version of libcurl from source that has only the features used by llama.cpp and link that into the binaries. This should be the most efficient and portable way I think
Related to this, I really want to build portable binaries for all flavors of Linux CentOS/Ubuntu/etc. Do the statically built binaries work well for these use cases?
@Vaibhavs10 @ggerganov @vignesh1507 did we ever fix the ship libcurl on Windows problem in the end?