
Use system-provided cyrus-sasl/libsasl2 at runtime

Open jgiannuzzi opened this issue 4 months ago • 9 comments

Motivation

Currently, the Python bindings require the user to build from source in order to use the SASL GSSAPI mechanism on Linux. This is because the manylinux policy for PyPI wheels does not allow linking against shared libraries outside a small whitelist of core system libraries, and libsasl2 is not on that list.

Other bindings that can use the builds linked against libsasl2 - like the .NET ones - do not work on Debian-based systems without adding a symlink from libsasl2.so.3 to libsasl2.so.2. This is because those builds are done on an RPM-based system, which follows a different soname policy: RPM-based distros ship libsasl2.so.3, while Debian-based ones ship libsasl2.so.2.

Implementation

Instead of having multiple builds with and without libsasl2 on Linux, we use dlopen to load libsasl2 at runtime on Unix. The availability of the SASL GSSAPI mechanism is thus checked at runtime.

The differences between Debian-based and RPM-based distros are addressed by probing the various known names for libsasl2. The subset of the ABI we use has not changed across the upstream soname bumps.
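For illustration, a minimal sketch of this runtime-loading approach is shown below. The candidate soname list, the helper name rd_sasl_dlopen, and the single sasl_client_init lookup are assumptions made for the example rather than code taken from this PR:

/* Minimal sketch only, not the code from this PR: probe the known
 * libsasl2 sonames and resolve one entry point at runtime. */
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

typedef int (*sasl_client_init_f)(const void *callbacks);

static void *rd_sasl_dlopen(void) {
        /* Candidate names: RPM-based distros ship libsasl2.so.3,
         * Debian-based ones ship libsasl2.so.2. */
        static const char *const names[] = {
                "libsasl2.so.3", "libsasl2.so.2", "libsasl2.so", NULL};
        size_t i;

        for (i = 0; names[i] != NULL; i++) {
                void *handle = dlopen(names[i], RTLD_NOW | RTLD_LOCAL);
                if (handle)
                        return handle;
        }
        return NULL;
}

int main(void) {
        void *handle = rd_sasl_dlopen();
        sasl_client_init_f client_init;

        if (!handle) {
                fprintf(stderr, "SASL GSSAPI unavailable: libsasl2 not found\n");
                return 1;
        }

        /* ISO C does not formally allow casting void * to a function
         * pointer, but it is the standard idiom on POSIX systems. */
        client_init = (sasl_client_init_f)dlsym(handle, "sasl_client_init");
        printf("libsasl2 loaded, sasl_client_init %s\n",
               client_init ? "resolved" : "missing");

        dlclose(handle);
        return 0;
}

On glibc older than 2.34 this needs to be linked with -ldl; probing the versioned sonames first keeps the unversioned libsasl2.so as a fallback for systems that only ship the development symlink.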

The documentation has been updated to remove the previous limitations around libsasl2/GSSAPI.

The CI and build scripts have been updated to only build one flavor per linux/libc, and the --disable-gssapi parameter has been removed.

The two build systems (mklove and CMake) have been updated to enable SASL GSSAPI support based on the availability of libdl instead of libsasl2.

The librdkafka.redist NuGet package has been updated to include only one build for linux/glibc/x64, as the centos8-librdkafka.so build is made obsolete by the single, dependency-free build.

The static library build now supports the SASL GSSAPI mechanism.

A very small subset of the libsasl2 header has been included in rdkafka_sasl_cyrus.c. The license can be found at https://github.com/cyrusimap/cyrus-sasl/blob/master/COPYING.
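For context, such an embedded subset is roughly of the following shape. The declarations below are a sketch based on the public cyrus-sasl API, not a copy of what this PR embeds; in particular, sasl_callback_t is kept opaque here for brevity:

/* Hypothetical sketch of a minimal embedded subset of the libsasl2 API.
 * Only the types and entry points needed at runtime are declared; the
 * symbols themselves are resolved via dlsym once libsasl2 is loaded. */
typedef struct sasl_conn sasl_conn_t;
typedef struct sasl_callback sasl_callback_t;

/* Initialize the client side of the SASL library. */
int sasl_client_init(const sasl_callback_t *callbacks);

/* Translate a SASL error code into a human-readable string. */
const char *sasl_errstring(int saslerr, const char *langlist,
                           const char **outlang);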

jgiannuzzi avatar Aug 14 '25 10:08 jgiannuzzi

:tada: All Contributor License Agreements have been signed. Ready to merge.
:white_check_mark: jgiannuzzi
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

Hi @emasab, could you please review this PR?

jgiannuzzi avatar Aug 14 '25 13:08 jgiannuzzi

This is causing us some pain due to environment differences at the moment. Any chance this can get looked at soon?

feldoh avatar Oct 10 '25 11:10 feldoh

Hi @jgiannuzzi. Thanks for contributing to librdkafka! Our supported way of using libsasl2 is by linking librdkafka dynamically: you install our librdkafka1 package for Debian or RH, which depends on libsasl2 as well (https://packages.confluent.io/clients), and then:

  • Python: pip install --no-binary=confluent-kafka
  • Go: build with -tags dynamic
  • .NET: exclude the librdkafka.redist assets:

<PackageReference Include="librdkafka.redist" Version="2.12.0" ExcludeAssets="All" />

  • JS: export CKJS_LINKING=dynamic

emasab avatar Nov 04 '25 16:11 emasab

Thanks for your comment @emasab! Sadly, the method you described does not work for a user install, and it requires the additional ecosystem-specific instructions you mentioned. This is exactly what this PR is trying to solve: allowing the regular Python wheels / .NET NuGet package / etc. to just work without having to install anything system-wide. Could you please consider the approach I suggest in this PR and let me know what you think about it?

jgiannuzzi avatar Nov 04 '25 16:11 jgiannuzzi

But you still have to install your distribution's libsasl2 package. If everything were included in the binary I'd agree. With Python you need a compiler to build the C extension, and we have similar requirements for CGO.

does not work for a user install sadly

Is it because you're distributing an application that uses librdkafka and GSSAPI and want to simplify the installation procedure?

emasab avatar Nov 04 '25 16:11 emasab

I can say that we've been using this patch internally ourselves for a few months now and it has resulted in a tremendous amount of user gratitude for a number of reasons:

  • Every lib that used Kafka used to have a series of big red boxes at the top of its docs giving people the special instructions needed to use Kafka. These have now been deleted, as there are no special instructions anymore.
  • Users would pip install XYZ packages that relied on confluent-kafka, but because you can't express the binary requirements as proper dependencies, these libs/tools were simply broken. Users who actually read all the docs would generally complain but get it right; most users skip the docs, just try to pip install as works for almost all other tools, and get sad. We've seen a measurable drop in support requests on this issue; in fact there have been zero in the last quarter.
  • We used to have, let's call them, late error states: someone's personal machine runs Ubuntu, but for various support or compatibility reasons the "prod" environment runs Red Hat, so their binary deps were wrong. You can reasonably argue this is a skill issue; however, most Kafka users do not know about this level of system detail. It is just a footgun, and since adopting this patch we've gone from a few issues every few weeks to zero. The footgun is simply gone; it just works.
  • Library providers would compile a binary wheel, because we have secure build systems where compiling from source is disallowed or just extremely slow. That wheel would work fine for them but not for the secure build. This also naturally prevented us from using the Confluent lib directly in the first place, so we had to do very careful alignment of base images, none of which is required at all after this change.

Side note: we tried baking in libsasl2 because our users were so frustrated with this, and that went badly, because people would set the SASL_PATH env var to whatever it was on their build machine (Jenkins), while the actual path on the prod image/machine would sometimes be different. This often gets caught in staging, but in certain cases it has even caused prod incidents, as certain Kafka code paths escaped testing in an environment where the SASL path varied.

You do have to install libsasl2, but that can trivially be added to most base images without users needing to worry about specifics. I haven't seen anyone actually ask about it; installing it just requires an apt install using the standard Unix dependency tooling, without users needing to understand the specifics of particular Kafka libs.

I know it seems like a minor optimisation, but it's hard to overstate just how much this has positively impacted people's dev and release flows with Kafka. We're starting to see broader adoption among groups that previously saw it as too much hassle, because now it "just works"™ without people needing any special environment tuning, which is always painful in locked-down environments.

feldoh avatar Nov 04 '25 19:11 feldoh

I suppose what I'm saying is that it doesn't strictly solve anything that couldn't be worked around before, but now there's no need for workarounds. It just removes a few footguns and smooths a few roads, much to the happiness of our users.

feldoh avatar Nov 04 '25 19:11 feldoh

I'd like to highlight another critical issue this PR resolves: version mismatches between the confluent-kafka Python package and the system-installed librdkafka, which lead to:

  • Lack of Reproducibility: When an application's dependencies are split between pip and a system package manager like apt, we lose reproducible builds. An application that works perfectly on a developer's machine can fail in a CI pipeline or production environment simply because the base image has a slightly different version of the system-installed librdkafka. This forces developers to debug the environment instead of their code.

  • Confusing Upgrade Path: The upgrade process is non-obvious and error-prone. A developer might upgrade the confluent-kafka Python package to get a new feature or bugfix, but see no change in behavior because the underlying C library they are actually using is the older, system-installed one. They have to remember to separately manage and upgrade the system package, which is an unintuitive and easily forgotten step.

This PR fixes these issues by aligning with modern packaging expectations. It ensures that installing the Python package brings along the exact C library it was built and tested with, making builds predictable and upgrades straightforward.

marcin-krystianc avatar Nov 05 '25 11:11 marcin-krystianc