confluent-kafka-python

bundle librdkafka binaries for aarch64

Open sibeream opened this issue 2 years ago • 9 comments

Description

docker run -ti python:3.8-slim-buster /bin/bash
pip install confluent-kafka==1.8.2

The above commands produce different results on x86_64 and aarch64 platforms.

Operating systems in use:

  • Fedora 34 (Intel i7)
  • macOS 12.3.1 (Apple M1)

On x86_64, confluent-kafka installs inside the container without any problem. On aarch64, the output is the following:

Collecting confluent-kafka==1.8.2
  Downloading confluent-kafka-1.8.2.tar.gz (104 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 104.6/104.6 KB 6.0 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: confluent-kafka
  Building wheel for confluent-kafka (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [46 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-aarch64-3.8
      creating build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/error.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/serializing_producer.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/deserializing_consumer.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      creating build/lib.linux-aarch64-3.8/confluent_kafka/admin
      copying src/confluent_kafka/admin/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/admin
      creating build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/verifiable_client.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/verifiable_producer.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/verifiable_consumer.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      creating build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/avro.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/schema_registry_client.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/error.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/protobuf.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/json_schema.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      creating build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/load.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/error.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/cached_schema_registry_client.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      creating build/lib.linux-aarch64-3.8/confluent_kafka/serialization
      copying src/confluent_kafka/serialization/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/serialization
      creating build/lib.linux-aarch64-3.8/confluent_kafka/avro/serializer
      copying src/confluent_kafka/avro/serializer/message_serializer.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro/serializer
      copying src/confluent_kafka/avro/serializer/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro/serializer
      running build_ext
      building 'confluent_kafka.cimpl' extension
      creating build/temp.linux-aarch64-3.8
      creating build/temp.linux-aarch64-3.8/tmp
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka/src
      gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/include/python3.8 -c /tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka/src/confluent_kafka.c -o build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka/src/confluent_kafka.o
      unable to execute 'gcc': No such file or directory
      error: command 'gcc' failed with exit status 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for confluent-kafka
  Running setup.py clean for confluent-kafka
Failed to build confluent-kafka
Installing collected packages: confluent-kafka
  Running setup.py install for confluent-kafka ... error
  error: subprocess-exited-with-error

  × Running setup.py install for confluent-kafka did not run successfully.
  │ exit code: 1
  ╰─> [46 lines of output]
      running install
      running build
      running build_py
      creating build
      creating build/lib.linux-aarch64-3.8
      creating build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/error.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/serializing_producer.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      copying src/confluent_kafka/deserializing_consumer.py -> build/lib.linux-aarch64-3.8/confluent_kafka
      creating build/lib.linux-aarch64-3.8/confluent_kafka/admin
      copying src/confluent_kafka/admin/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/admin
      creating build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/verifiable_client.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/verifiable_producer.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      copying src/confluent_kafka/kafkatest/verifiable_consumer.py -> build/lib.linux-aarch64-3.8/confluent_kafka/kafkatest
      creating build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/avro.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/schema_registry_client.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/error.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/protobuf.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      copying src/confluent_kafka/schema_registry/json_schema.py -> build/lib.linux-aarch64-3.8/confluent_kafka/schema_registry
      creating build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/load.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/error.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      copying src/confluent_kafka/avro/cached_schema_registry_client.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro
      creating build/lib.linux-aarch64-3.8/confluent_kafka/serialization
      copying src/confluent_kafka/serialization/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/serialization
      creating build/lib.linux-aarch64-3.8/confluent_kafka/avro/serializer
      copying src/confluent_kafka/avro/serializer/message_serializer.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro/serializer
      copying src/confluent_kafka/avro/serializer/__init__.py -> build/lib.linux-aarch64-3.8/confluent_kafka/avro/serializer
      running build_ext
      building 'confluent_kafka.cimpl' extension
      creating build/temp.linux-aarch64-3.8
      creating build/temp.linux-aarch64-3.8/tmp
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka
      creating build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka/src
      gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/include/python3.8 -c /tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka/src/confluent_kafka.c -o build/temp.linux-aarch64-3.8/tmp/pip-install-78fm53p7/confluent-kafka_060c7cd0cc7442439d07b79c45239cd7/src/confluent_kafka/src/confluent_kafka.o
      unable to execute 'gcc': No such file or directory
      error: command 'gcc' failed with exit status 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> confluent-kafka

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

I assume the reason for the discrepancy in behaviour is that prebuilt binaries are bundled for x86_64 but are missing for aarch64. Is it possible to prebuild and bundle those binaries on your side? The discrepancy in deliverables makes it far more complicated for users of the package to support different architectures.
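To illustrate the assumption above: a wheel file name ends in a platform tag, and an installer only uses a prebuilt wheel whose tag fits the current machine; otherwise it falls back to the source archive and has to compile. The following is a minimal sketch, not pip's real resolution logic (pip applies the full tag-compatibility rules). The function name and the example file names are hypothetical and only serve the illustration.

```python
import platform


def wheel_supports_machine(wheel_filename: str, machine: str) -> bool:
    """Simplified check: does the wheel's platform tag target `machine`?

    Wheel file names end in "-{python tag}-{abi tag}-{platform tag}.whl".
    This is NOT pip's full compatibility logic, only a rough illustration.
    """
    if not wheel_filename.endswith(".whl"):
        return False  # a source archive (.tar.gz) has to be built locally
    platform_tag = wheel_filename[: -len(".whl")].split("-")[-1]
    # A compressed tag set such as "manylinux_2_17_x86_64.manylinux2014_x86_64"
    # lists several platform tags separated by dots.
    return any(
        tag == "any" or tag.endswith("_" + machine)
        for tag in platform_tag.split(".")
    )


# Hypothetical file names, for illustration only.
available = [
    "example_pkg-1.0-cp38-cp38-manylinux2010_x86_64.whl",
    "example_pkg-1.0.tar.gz",
]

machine = platform.machine()  # e.g. "x86_64", or "aarch64" on 64-bit ARM Linux
has_wheel = any(wheel_supports_machine(name, machine) for name in available)
print(f"{machine}: prebuilt wheel available = {has_wheel}")
```

With only an x86_64 wheel published, the check is false on aarch64, which matches the log above: pip downloads the .tar.gz and tries to run gcc.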

How to reproduce

Run the following commands on an aarch64 platform (Apple M1, for instance):

docker run -ti python:3.8-slim-buster /bin/bash
pip install confluent-kafka==1.8.2

Checklist

Please provide the following information:

  • [ ] confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()):
  • [ ] Apache Kafka broker version:
  • [ ] Client configuration: {...}
  • [x] Operating system: macOS 12.3.1 (Apple M1) and Fedora 34 (Intel i7)
  • [ ] Provide client logs (with 'debug': '..' as necessary)
  • [ ] Provide broker log excerpts
  • [ ] Critical issue

sibeream, Apr 21 '22 09:04

From the errors, it looks like this is related to gcc. Have you already installed it?

      unable to execute 'gcc': No such file or directory
      error: command 'gcc' failed with exit status 1

jliunyu, Apr 21 '22 23:04

I can confirm that gcc is NOT installed in either the x86_64 or the aarch64 container. However, pip was able to install confluent-kafka in the x86_64 container and fails in the aarch64 one. I think this is because gcc is never called during installation on x86_64, unlike in the aarch64 case. That's exactly the discrepancy I'm talking about. I would expect the same steps to lead to the same results, regardless of platform.

My assumption is that some binaries are bundled with the x86_64 version of the package on PyPI, while for aarch64 pip attempts to build them during installation.
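Whether a fallback source build can even start depends on a compiler being on PATH, which is what the "unable to execute 'gcc'" line in the log reports. A minimal sketch of a preflight check, using only the standard library; the function name is hypothetical, and the default tool list contains only gcc because that is the one tool the log names.

```python
import shutil
from typing import Iterable, List


def missing_build_tools(tools: Iterable[str] = ("gcc",)) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_build_tools()
if missing:
    print("A source build would fail; missing:", ", ".join(missing))
else:
    print("Compiler found; a source build can at least start.")
```

Running this in both containers should report gcc as missing in each, yet only the aarch64 install needs it.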

Possibly related issues for the Go version of the package:
https://github.com/confluentinc/confluent-kafka-go/issues/684
https://github.com/confluentinc/confluent-kafka-go/issues/591

sibeream, Apr 22 '22 08:04

Why are there no prebuilt wheels for arm64?

llamahunter, Apr 30 '22 01:04

It's not a perfect world. But if you want to test Docker images on Apple Silicon (M1), you have to compile the ARM64 binaries yourself using a multi-stage build:

ARG SPARK_IMAGE=spark-base:latest
FROM ${SPARK_IMAGE}
USER root

RUN apt-get update && apt-get install -y git gcc g++ make

# gcc, g++ and make (installed above) are needed to compile librdkafka
RUN cd /tmp && git clone https://github.com/edenhill/librdkafka.git && \
    cd librdkafka && git checkout tags/v1.9.0 && \
    ./configure && make && make install && \
    cd ../ && rm -rf librdkafka

Then copy the build output into your final image before running pip install confluent-kafka:

COPY --from=0 /usr/local/lib /usr/local/lib
COPY --from=0 /usr/local/include /usr/local/include
COPY --from=0 /usr/local/share/doc/librdkafka /usr/local/share/doc/librdkafka

RUN pip install confluent-kafka

ignitz, Jun 27 '22 23:06

Hey @sibeream, this is a very valid request!

I'm thinking of creating a wrapper for this package that bundles binaries for both x86_64 and aarch64.

confluent-kafka-binary ?

confiq, Jul 28 '22 13:07

I have built confluent-kafka wheels for different platforms (Mac ARM, Linux ARM) and written instructions on how to do that in this forked repo. Feel free to use them: https://github.com/sergii-tsymbal-exa/confluent-kafka-python/releases/tag/v1.8.2

sergii-tsymbal-exa, Aug 08 '22 08:08

@sergii-tsymbal-exa I tested your wheel for aarch64 and I am getting the following error message on start:

ImportError: librdkafka.so.1: cannot open shared object file: No such file or directory

b3n3w, Aug 24 '22 12:08

@b3n3w Thanks for noticing. I can reproduce this issue. It means that my approach does not fully work, and you still need to set up the preconditions specified in the build instructions.
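One way to see whether the shared library is visible before importing the package is to ask the standard library's lookup helper. This is a rough sketch with a hypothetical function name. It assumes the library is installed system-wide under the name rdkafka; a wheel that ships its own private copy of the library would not show up here.

```python
import ctypes.util
from typing import Optional


def find_librdkafka() -> Optional[str]:
    """Return the name or path that the system library lookup resolves, or None."""
    return ctypes.util.find_library("rdkafka")


found = find_librdkafka()
if found is None:
    print("librdkafka was not found; importing confluent_kafka may fail.")
else:
    print("librdkafka found:", found)
```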

sergii-tsymbal-exa, Aug 25 '22 09:08

This runs successfully on my Apple M1. Could someone confirm whether it is useful? (Based on @ignitz's comment.)

FROM python:3.8.13-slim-bullseye

RUN apt-get update && \
  apt-get install -y --no-install-recommends gcc git libssl-dev g++ make && \
  cd /tmp && git clone https://github.com/edenhill/librdkafka.git && \
  cd librdkafka && git checkout tags/v1.9.0 && \
  ./configure && make && make install && \
  cd ../ && rm -rf librdkafka

RUN pip install confluent-kafka==1.9.2

armenzg, Aug 26 '22 19:08

This allows me to install on my M1 Mac:

brew install librdkafka
export C_INCLUDE_PATH=/opt/homebrew/Cellar/librdkafka/1.9.2/include
export LIBRARY_PATH=/opt/homebrew/Cellar/librdkafka/1.9.2/lib
pip install confluent-kafka
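The Cellar paths above are pinned to version 1.9.2, so they stop matching once Homebrew upgrades librdkafka. A minimal sketch of the same idea, assuming Homebrew is installed: ask brew --prefix librdkafka for the install prefix and derive both variables from it. The helper names are hypothetical, and nothing runs until the install function is called explicitly.

```python
import os
import subprocess
import sys
from typing import Dict


def librdkafka_build_env(prefix: str) -> Dict[str, str]:
    """Copy the current environment and point the compiler at `prefix`."""
    env = dict(os.environ)
    env["C_INCLUDE_PATH"] = os.path.join(prefix, "include")
    env["LIBRARY_PATH"] = os.path.join(prefix, "lib")
    return env


def install_with_homebrew_librdkafka() -> None:
    """Run pip install confluent-kafka against Homebrew's librdkafka."""
    # `brew --prefix <formula>` prints a version-independent install prefix.
    prefix = subprocess.run(
        ["brew", "--prefix", "librdkafka"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "confluent-kafka"],
        check=True, env=librdkafka_build_env(prefix),
    )
```

Call install_with_homebrew_librdkafka() after brew install librdkafka. Like the export lines above, this overwrites any existing C_INCLUDE_PATH and LIBRARY_PATH for the pip subprocess only.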

nathanwebsterdotme, Oct 25 '22 11:10

tracking aarch64 binary wheel support in #1439

mhowlett, Oct 25 '22 21:10

@armenzg Thanks for this. I am using an Apple M1, and your Dockerfile makes the installation work, but I still got ImportError: librdkafka.so.1: cannot open shared object file: No such file or directory. I had to add ldconfig as shown below to avoid the ImportError, which made the container exit. I got this from https://github.com/confluentinc/confluent-kafka-python/issues/65#issuecomment-269964346

I added these lines to the Dockerfile used by my docker-compose.yml and it works:

RUN apt-get update && \
  apt-get install -y --no-install-recommends gcc git libssl-dev g++ make && \
  cd /tmp && git clone https://github.com/edenhill/librdkafka && \
  cd librdkafka && git checkout tags/v2.0.2 && \
  ./configure && make && make install && \
  ldconfig && \
  cd ../ && rm -rf librdkafka

RUN pip install confluent-kafka==2.0.2
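To confirm inside the container that the ldconfig step had the intended effect, a short import check can help. This is a sketch with a hypothetical function name; it relies only on confluent_kafka.version() and confluent_kafka.libversion(), the two calls named in the issue checklist.

```python
def describe_confluent_kafka() -> str:
    """Try to import confluent_kafka and report its versions, or the import error."""
    try:
        import confluent_kafka
    except ImportError as exc:
        # Covers both "package not installed" and "librdkafka.so.1 not found".
        return f"import failed: {exc}"
    return (
        f"confluent-kafka-python {confluent_kafka.version()}, "
        f"librdkafka {confluent_kafka.libversion()}"
    )


print(describe_confluent_kafka())
```

If the message still mentions librdkafka.so.1, the library installed under /usr/local/lib is not yet in the linker cache.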

phoebe20200523, Feb 14 '23 02:02

It looks like version 2.2.0-1 of librdkafka-dev is now available in the Debian repo for bookworm: https://tracker.debian.org/pkg/librdkafka

Version 1.9.2-1 is available through the Debian snapshot repo: http://snapshot.debian.org/package/librdkafka/1.9.2-1/

misimpso, Jul 19 '23 17:07