confluent-kafka-go
Segmentation fault when creating a producer after a consumer
Description
When I create a producer after creating a consumer, I get a segmentation fault. This happens only when I configure the bootstrap servers as hostnames; if I use IP addresses, everything works normally. It also happens only when I use static linking.
How to reproduce
source code:
package main

import (
	"github.com/confluentinc/confluent-kafka-go/kafka"
)

func main() {
	consumer, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers":  "example1.com:9092",
		"group.id":           "test",
		"auto.offset.reset":  "earliest",
		"enable.auto.commit": true,
		"debug":              "all",
	})
	if err != nil {
		panic(err)
	}

	println("before creating producer")
	producer, err := kafka.NewProducer(&kafka.ConfigMap{
		"bootstrap.servers": "example2.com:9092",
		"acks":              "all",
		"debug":             "all",
	})
	if err != nil {
		panic(err)
	}
	println("after creating producer")

	// Reference both clients so the compiler does not reject them as unused.
	println(producer.String())
	println(consumer.String())
}
Build command: go build -a -ldflags "-linkmode external -extldflags -static" main.go
output:
%7|1651503922.240|MEMBERID|rdkafka#consumer-1| [thrd:app]: Group "test": updating member id "(not-set)" -> ""
%7|1651503922.240|WAKEUPFD|rdkafka#consumer-1| [thrd:app]: GroupCoordinator: Enabled low-latency ops queue wake-ups
%7|1651503922.240|BROKER|rdkafka#consumer-1| [thrd:app]: GroupCoordinator: Added new broker with NodeId -1
%7|1651503922.240|BRKMAIN|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: Enter main broker thread
%7|1651503922.240|WAKEUPFD|rdkafka#consumer-1| [thrd:app]: example1.com:9092/bootstrap: Enabled low-latency ops queue wake-ups
%7|1651503922.240|BRKMAIN|rdkafka#consumer-1| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1651503922.240|BROKER|rdkafka#consumer-1| [thrd:app]: example1.com:9092/bootstrap: Added new broker with NodeId -1
%7|1651503922.240|CGRPSTATE|rdkafka#consumer-1| [thrd:main]: Group "test" changed state init -> query-coord (join-state init)
%7|1651503922.240|INIT|rdkafka#consumer-1| [thrd:app]: librdkafka v1.8.2 (0x10802ff) rdkafka#consumer-1 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer, STRIP STATIC_LINKING CC GXX PKGCONFIG INSTALL GNULD LDS LIBDL PLUGINS STATIC_LIB_zlib ZLIB STATIC_LIB_libcrypto STATIC_LIB_libssl SSL STATIC_LIB_libzstd ZSTD HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER CRC32C_HW, debug 0xfffff)
%7|1651503922.240|BROADCAST|rdkafka#consumer-1| [thrd:main]: Broadcasting state change
%7|1651503922.240|CONNECT|rdkafka#consumer-1| [thrd:main]: example1.com:9092/bootstrap: Selected for cluster connection: coordinator query (broker has 0 connection attempt(s))
%7|1651503922.240|CGRPQUERY|rdkafka#consumer-1| [thrd:main]: Group "test": no broker available for coordinator query: intervaled in state query-coord
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: Client configuration:
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: client.software.name = confluent-kafka-go
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: client.software.version = 1.8.2
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: metadata.broker.list = example1.com:9092
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: debug = generic,broker,topic,metadata,feature,queue,msg,protocol,cgrp,security,fetch,interceptor,plugin,consumer,admin,eos,mock,assignor,conf,all
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: enabled_events = 376
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: default_topic_conf = 0x12d85c0
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: group.id = test
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: enable.auto.commit = true
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: Default topic configuration:
%7|1651503922.240|CONF|rdkafka#consumer-1| [thrd:app]: auto.offset.reset = smallest
%7|1651503922.240|BRKMAIN|rdkafka#consumer-1| [thrd:example1.com:9092/bootstrap]: example1.com:9092/bootstrap: Enter main broker thread
before creating producer
%7|1651503922.240|CONNECT|rdkafka#consumer-1| [thrd:example1.com:9092/bootstrap]: example1.com:9092/bootstrap: Received CONNECT op
%7|1651503922.240|STATE|rdkafka#consumer-1| [thrd:example1.com:9092/bootstrap]: example1.com:9092/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1651503922.240|BROADCAST|rdkafka#consumer-1| [thrd:example1.com:9092/bootstrap]: Broadcasting state change
%7|1651503922.240|CONNECT|rdkafka#consumer-1| [thrd:example1.com:9092/bootstrap]: example1.com:9092/bootstrap: broker in state TRY_CONNECT connecting
%7|1651503922.240|STATE|rdkafka#consumer-1| [thrd:example1.com:9092/bootstrap]: example1.com:9092/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1651503922.240|BROADCAST|rdkafka#consumer-1| [thrd:example1.com:9092/bootstrap]: Broadcasting state change
%7|1651503922.240|WAKEUPFD|rdkafka#producer-2| [thrd:app]: example2.com:9092/bootstrap: Enabled low-latency ops queue wake-ups
%7|1651503922.240|BRKMAIN|rdkafka#producer-2| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1651503922.240|BROKER|rdkafka#producer-2| [thrd:app]: example2.com:9092/bootstrap: Added new broker with NodeId -1
%7|1651503922.240|CONNECT|rdkafka#producer-2| [thrd:app]: example2.com:9092/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))
%7|1651503922.240|INIT|rdkafka#producer-2| [thrd:app]: librdkafka v1.8.2 (0x10802ff) rdkafka#producer-2 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer, STRIP STATIC_LINKING CC GXX PKGCONFIG INSTALL GNULD LDS LIBDL PLUGINS STATIC_LIB_zlib ZLIB STATIC_LIB_libcrypto STATIC_LIB_libssl SSL STATIC_LIB_libzstd ZSTD HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER CRC32C_HW, debug 0xfffff)
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: Client configuration:
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: client.software.name = confluent-kafka-go
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: client.software.version = 1.8.2
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: metadata.broker.list = example2.com:9092
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: debug = generic,broker,topic,metadata,feature,queue,msg,protocol,cgrp,security,fetch,interceptor,plugin,consumer,admin,eos,mock,assignor,conf,all
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: enabled_events = 329
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: default_topic_conf = 0x12e3d70
%7|1651503922.240|BRKMAIN|rdkafka#producer-2| [thrd:example2.com:9092/bootstrap]: example2.com:9092/bootstrap: Enter main broker thread
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: Default topic configuration:
%7|1651503922.240|CONF|rdkafka#producer-2| [thrd:app]: request.required.acks = -1
%7|1651503922.240|CONNECT|rdkafka#producer-2| [thrd:example2.com:9092/bootstrap]: example2.com:9092/bootstrap: Received CONNECT op
%7|1651503922.240|STATE|rdkafka#producer-2| [thrd:example2.com:9092/bootstrap]: example2.com:9092/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1651503922.240|BROADCAST|rdkafka#producer-2| [thrd:example2.com:9092/bootstrap]: Broadcasting state change
%7|1651503922.240|CONNECT|rdkafka#producer-2| [thrd:example2.com:9092/bootstrap]: example2.com:9092/bootstrap: broker in state TRY_CONNECT connecting
%7|1651503922.240|STATE|rdkafka#producer-2| [thrd:example2.com:9092/bootstrap]: example2.com:9092/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1651503922.240|BROADCAST|rdkafka#producer-2| [thrd:example2.com:9092/bootstrap]: Broadcasting state change
Segmentation fault (core dumped)
As you can see, "after creating producer" is never printed.
Checklist
Please provide the following information:
- [x] confluent-kafka-go and librdkafka version (LibraryVersion()): 1.8.2
- [ ] Apache Kafka broker version:
- [x] Client configuration: ConfigMap{...}
- [x] Operating system: Ubuntu 20.04.2 LTS
- [x] Provide client logs (with "debug": ".." as necessary)
- [ ] Provide broker log excerpts
- [ ] Critical issue
Yeah, this seems to be related to the resolver:
==1695972== Thread 14 rdk:broker-1:
==1695972== Invalid read of size 1
==1695972== at 0x2FD215CA: internal_getent (files-XXX.c:173)
==1695972== by 0x2FD229F3: _nss_files_gethostbyname4_r (files-hosts.c:400)
==1695972== by 0x9841EE: gaih_inet.constprop.0 (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x985D38: getaddrinfo (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x59BE33: rd_getaddrinfo (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x54C81D: rd_kafka_broker_thread_main (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x4F2475: _thrd_wrapper_function (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x8EB6F8: start_thread (pthread_create.c:477)
==1695972== by 0x98B392: clone (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== Address 0x63 is not stack'd, malloc'd or (recently) free'd
==1695972==
==1695972==
==1695972== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==1695972== Access not within mapped region at address 0x63
==1695972== at 0x2FD215CA: internal_getent (files-XXX.c:173)
==1695972== by 0x2FD229F3: _nss_files_gethostbyname4_r (files-hosts.c:400)
==1695972== by 0x9841EE: gaih_inet.constprop.0 (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x985D38: getaddrinfo (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x59BE33: rd_getaddrinfo (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x54C81D: rd_kafka_broker_thread_main (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x4F2475: _thrd_wrapper_function (in /home/maglun/gocode/src/testar/issue-774/main)
==1695972== by 0x8EB6F8: start_thread (pthread_create.c:477)
==1695972== by 0x98B392: clone (in /home/maglun/gocode/src/testar/issue-774/main)
It seems that with your extra build flags the binary won't dynamically link to libc et al.
Standard static go build:
$ ldd issue-774
linux-vdso.so.1 (0x00007ffde157b000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f4a69f17000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f4a69f11000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f4a69eee000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4a69cfc000)
/lib64/ld-linux-x86-64.so.2 (0x00007f4a6a0ab000)
With go build -a -ldflags "-linkmode external -extldflags -static" main.go:
$ ldd main
not a dynamic executable
Thank you for your attention. So you mean that, for this usage, the binary should be dynamically linked against these libraries?
Yep, if you want to use glibc you have to use dynamic linking: functions like getaddrinfo rely on glibc's NSS plugins, which are loaded at runtime, so they cannot be safely linked statically.
If you are okay with using musl, you can achieve a fully static build as well.
The details are here: https://github.com/confluentinc/confluent-kafka-go#static-builds-on-linux
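As a rough sketch of what the README linked above describes, a musl-based static build could look like the following. This assumes the project's `musl` build tag and that `musl-gcc` is installed (e.g. via the `musl-tools` package on Debian/Ubuntu); treat it as a starting point rather than a verified command for every environment.

```shell
# Build fully statically against musl instead of glibc.
# The "musl" build tag tells confluent-kafka-go to use its musl-built
# bundled librdkafka; musl's resolver does not dlopen NSS plugins,
# so getaddrinfo works in a static binary.
CC=$(which musl-gcc) \
  go build -tags musl -ldflags '-linkmode external -extldflags "-static"' main.go

# Verify the result: ldd should report "not a dynamic executable".
ldd main
```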