Apache Doris routine load: consuming Kafka data with the librdkafka client causes backend nodes (mainly data storage and computation) to stop

Open hf200012 opened this issue 2 years ago • 2 comments

Read the FAQ first: https://github.com/edenhill/librdkafka/wiki/FAQ

Do NOT create issues for questions, use the discussion forum: https://github.com/edenhill/librdkafka/discussions

Description

During an Apache Doris routine load, consuming Kafka data with the librdkafka client causes the backend node (mainly responsible for data storage and computation) to stop. The stack trace is as follows:

*** rdkafka_cgrp.c:2680:rd_kafka_cgrp_terminated: assert: !rd_kafka_assignment_in_progress(rkcg->rkcg_rk) ***
*** Aborted at 1665964473 (unix time) try "date -d @1665964473" if you are using GNU date ***
*** SIGABRT unkown detail explain (@0x4d96) received by PID 19862 (TID 0x7f00fb0db700) from PID 19862; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk2/apache-doris/be/src/common/signal_handler.h:420
 1# 0x00007F01FA93A400 in /lib64/libc.so.6
 2# gsignal in /lib64/libc.so.6
 3# abort in /lib64/libc.so.6
 4# 0x0000000004EC95F1 in /ssd/app/doris-1.1.1/be/lib/doris_be
 5# 0x0000000004F23C2A in /ssd/app/doris-1.1.1/be/lib/doris_be
 6# rd_kafka_cgrp_serve in /ssd/app/doris-1.1.1/be/lib/doris_be
 7# 0x0000000004F1F2C1 in /ssd/app/doris-1.1.1/be/lib/doris_be
 8# rd_kafka_buf_callback in /ssd/app/doris-1.1.1/be/lib/doris_be
 9# rd_kafka_op_handle_std in /ssd/app/doris-1.1.1/be/lib/doris_be
10# rd_kafka_op_handle in /ssd/app/doris-1.1.1/be/lib/doris_be
11# rd_kafka_q_serve in /ssd/app/doris-1.1.1/be/lib/doris_be
12# 0x0000000004ECB239 in /ssd/app/doris-1.1.1/be/lib/doris_be
13# 0x0000000004F4ADD6 in /ssd/app/doris-1.1.1/be/lib/doris_be
14# start_thread in /lib64/libpthread.so.0
15# clone in /lib64/libc.so.6

Any help from your side would be appreciated.

How to reproduce

<your steps how to reproduce goes here, or remove section if not relevant>

IMPORTANT: Always try to reproduce the issue on the latest released version (see https://github.com/edenhill/librdkafka/releases), if it can't be reproduced on the latest version the issue has been fixed.

Checklist

IMPORTANT: We will close issues where the checklist has not been completed.

Please provide the following information:

  • [x] librdkafka version (release number or git tag): 1.8.2
  • [x] Apache Kafka version: 2.13-2.7.1
  • [x] librdkafka client configuration: default
  • [x] Operating system: centos 7.x
  • [ ] Provide logs (with debug=.. as necessary) from librdkafka
  • [ ] Provide broker log excerpts
  • [ ] Critical issue

hf200012 • Oct 19 '22 05:10

I see a similar issue occurring randomly in my CI: https://github.com/karafka/karafka/actions/runs/3276318559/jobs/5392240376. Maybe that will help somehow.

mensfeld • Oct 19 '22 07:10

Please try to reproduce on the latest librdkafka version

edenhill • Nov 17 '22 14:11

Still happening once in a while on 2.0.2

ref https://github.com/karafka/karafka/actions/runs/4687378215/jobs/8306612067#step:9:258

mensfeld • Apr 13 '23 09:04

Another one, with:

 rdkafka_cgrp.c:2612:rd_kafka_cgrp_terminated: assert: !rd_kafka_assignment_in_progress(rkcg->rkcg_rk) ***

https://github.com/karafka/karafka/actions/runs/5884038699/job/15957968275?pr=1550 (run 1)

mensfeld • Aug 17 '23 07:08