Andrew Wong
Yeah, that's a good observation. While we are probably no longer hanging in the test thread, it seems reasonable to expect something like #5990 coming in from the verifier threads...
John's kgo-verifier rework has landed (https://github.com/redpanda-data/redpanda/pull/6059), so we shouldn't see these hangs anymore. This hasn't been reported in a couple of weeks, which is a good sign.
Seen again in both the big and many partitions cases:
FAIL test: ConsumerGroupTest.test_basic_group_join.static_members=False (1/24 runs)
failure at 2022-08-17T07:38:31.224Z: TimeoutError('') in job https://buildkite.com/redpanda/vtools/builds/3271#0182a9c3-a7aa-4850-a6b7-65bee8152d80
stack trace:
```
====================================================================================================
test_id: rptest.tests.consumer_group_test.ConsumerGroupTest.test_basic_group_join.static_members=False
status: FAIL...
```
This smells fishy. The test isn't doing anything that takes remotely close to 10 minutes to finish.
Looking at some other debug test runs, the test typically completes in 3-4 minutes, and the default timeout for a service wait is 10 minutes, so I'm skeptical this is...
Seen again here:
FAIL test: ConsumerOffsetsMigrationTest.test_migrating_consume_offsets.failures=False.cpus=1 (1/24 runs)
failure at 2022-08-17T10:53:32.899Z: TimeoutError('Redpanda node docker-rp-5 failed to stop in 60 seconds') in job https://buildkite.com/redpanda/redpanda/builds/14253#0182aaf0-b6fb-4499-a1df-8288ad33b894
Stack trace:
```
====================================================================================================
test_id: rptest.tests.consumer_offsets_migration_test.ConsumerOffsetsMigrationTest.test_migrating_consume_offsets.failures=False.cpus=1
status:...
```
Seen again in https://buildkite.com/redpanda/redpanda/builds/16208#0183a9c9-4cb9-4911-a107-536c79f8f3d5
https://ci-artifacts.dev.vectorized.cloud/redpanda/0183a9c9-4cb9-4911-a107-536c79f8f3d5/vbuild/ducktape/results/2022-10-05--001/report.html
```
TRACE 2022-10-05 21:21:59,816 [shard 0] kafka - request_context.h:168 - [172.16.16.27:36196] sending 3:metadata for {rdkafka}, response {throttle_time_ms=0 brokers={{node_id=3 host=docker-rp-8 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-6 port=9092 rack={nullopt}}, {node_id=2...
```
It looks like the transaction test is failing now:
```
====================================================================================================
test_id: rptest.tests.transactions_test.MixedVersionTransactionsTest.test_txn_rpcs_with_upgrade
status: FAIL
run time: 2 minutes 10.358 seconds
KafkaException(KafkaError{code=INVALID_TXN_STATE,val=48,str="Failed to initialize Producer ID: Broker: Producer attempted a...
```
```
73 bytes)}, writer=nullptr, cache=nullptr, compaction_index:nullopt, closed=0, tombstone=0, index={file:test.dir_1712765242/redpanda/kvstore/0_0/0-0-v1.base_index, offsets:0, index:{header_bitflags:0, base_offset:0, max_offset:38, base_timestamp:{timestamp: 1712765243573}, max_timestamp:{timestamp: 1712765243842}, batch_timestamps_are_monotonic:1, with_offset:false, non_data_timestamps:0, broker_timestamp:{{timestamp: 1712765243842}}, num_compactible_records_appended:{39}, index(1,1,1)}, step:32768, needs_persistence:0}}
unknown file: Failure
C++...
```
@amnonh Hello! Gentle nudge about this when you get the chance. I'm happy to answer questions and provide additional context if needed. Alternatively, I could tag the reviewers from https://github.com/scylladb/seastar/pull/1741...