Stray consumer in `ConsumerGroupTest.test_basic_group_join.static_members=False`

Open VladLazar opened this issue 3 years ago • 1 comments

https://buildkite.com/redpanda/redpanda/builds/15526#01835b7c-86e1-4072-b714-3dbeafe0fa45

A KafkaCliConsumer service failed to stop and caused all subsequent tests to fail (on this assertion) as brokers were unexpectedly creating a kafka_internal topic on start-up.

My hunch is that the consumer ignored the SIGTERM signal for some reason, but it's hard to tell since there are no logs for it.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 476, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/root/tests/rptest/services/cluster.py", line 35, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/consumer_group_test.py", line 140, in test_basic_group_join
    c.wait()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/services/background_thread.py", line 72, in wait
    super(BackgroundThreadService, self).wait(timeout_sec)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/services/service.py", line 267, in wait
    raise TimeoutError("Timed out waiting %s seconds for service nodes to finish. " % str(timeout_sec)
ducktape.errors.TimeoutError: Timed out waiting 600 seconds for service nodes to finish. These nodes are still alive: ['KafkaCliConsumer-0-139938024223840 node 1 on docker-rp-1']

VladLazar avatar Sep 21 '22 13:09 VladLazar

My hunch is that the client ignores the SIGTERM signal for some reason. Maybe we've hit a bug in its graceful shutdown logic. We could simply send SIGKILL instead.
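One way the SIGKILL idea could look is a stop routine that still tries SIGTERM first and only escalates once a grace period expires. The following is a minimal sketch using only the Python standard library. It is not the actual `KafkaCliConsumer` or ducktape code; `stop_with_escalation`, `spawn_stubborn_child`, and the `grace_sec` parameter are hypothetical names introduced for illustration, and it assumes a POSIX host.

```python
import signal
import subprocess
import sys


def stop_with_escalation(proc: subprocess.Popen, grace_sec: float = 5.0) -> str:
    """Send SIGTERM, wait up to grace_sec, then fall back to SIGKILL.

    Returns a label describing how the process was stopped.
    """
    if proc.poll() is not None:
        return "already exited"
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_sec)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "SIGKILL"


def spawn_stubborn_child() -> subprocess.Popen:
    """Hypothetical stand-in for a stuck consumer: a child that ignores SIGTERM."""
    child_code = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # block until the child has installed its handler
    return proc


if __name__ == "__main__":
    child = spawn_stubborn_child()
    print(stop_with_escalation(child, grace_sec=1.0))
    child.stdout.close()
```

Keeping the SIGTERM attempt means a healthy consumer still gets the chance to leave the group cleanly, while a stuck one can no longer hold up the test run until the 600 second wait times out.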

VladLazar avatar Sep 21 '22 14:09 VladLazar

@VladLazar did you want to take this up and help out? If you are busy, then please bubble up in standup/slack thread to see if someone else can.

piyushredpanda avatar Sep 22 '22 00:09 piyushredpanda

@piyushredpanda Yep. That makes sense since I've already spent time looking into it. Assigned to myself.

VladLazar avatar Sep 22 '22 09:09 VladLazar