redpanda
[transaction] Add expire old txs to consumer group service
Cover letter
This adds logic to expire old txs on the group side. We need it to avoid the following situations:
- A transaction was committed during consumer group reconfiguration. In this case the client request is retried after 1 minute. We try to do it faster if the timeout is small.
- The transaction coordinator changes while a transaction is in progress.
This PR adds the same logic that rm_stm uses to expire old transactions. The group checks each transaction and expires it if needed.
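The expiration check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `group_tx_tracker`, `tx_info`, `begin_or_touch`, and `expire_old_txs` are hypothetical names, and the sketch only drops an expired entry, whereas the real group would presumably abort the transaction.

```cpp
#include <chrono>
#include <cstdint>
#include <unordered_map>
#include <vector>

using clock_type = std::chrono::steady_clock;

// Hypothetical per-producer transaction record (not redpanda's actual type).
struct tx_info {
    clock_type::time_point last_update;
    std::chrono::milliseconds timeout;
};

// Hypothetical stand-in for the group's transaction bookkeeping.
class group_tx_tracker {
public:
    // Record a new transaction or refresh the deadline of an existing one.
    void begin_or_touch(
      int64_t producer_id,
      clock_type::time_point now,
      std::chrono::milliseconds timeout) {
        _txs[producer_id] = tx_info{now, timeout};
    }

    // Remove every transaction whose deadline has passed and
    // return the producer ids that were expired.
    std::vector<int64_t> expire_old_txs(clock_type::time_point now) {
        std::vector<int64_t> expired;
        for (auto it = _txs.begin(); it != _txs.end();) {
            if (it->second.last_update + it->second.timeout <= now) {
                expired.push_back(it->first);
                it = _txs.erase(it);
            } else {
                ++it;
            }
        }
        return expired;
    }

    bool contains(int64_t producer_id) const {
        return _txs.count(producer_id) > 0;
    }

private:
    std::unordered_map<int64_t, tx_info> _txs;
};
```

In this sketch a periodic timer would call `expire_old_txs` with the current time, so a transaction with a small timeout is cleaned up without waiting for the client's retry.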
Fixes #5543
Backport Required
- [ ] not a bug fix
- [ ] papercut/not impactful enough to backport
- [ ] v22.2.x
- [ ] v22.1.x
- [ ] v21.11.x
UX changes
- none
Release notes
- none
The test for this logic is suites/tests/tx_subscribe/stop_client.json from chaos.
It should not hang.
Failure in:
TRACE 2022-08-11 13:42:59,935 [shard 0] kafka - offset_fetch.cc:45 - Handling request {group_id={test_group} topics={{{name={t} partition_indexes={{0}}}}} require_stable=false}
../../../src/v/pandaproxy/rest/test/consumer_group.cc(201): fatal error: in "pandaproxy_consumer_group": critical check res.body == R"({"offsets":[{"topic":"t","partition":0,"offset":-1,"metadata":""}]})" has failed [{"offsets":[]} != {"offsets":[{"topic":"t","partition":0,"offset":-1,"metadata":""}]}]
Looks like an error in the test code.
Do we need any test coverage with this patch? Especially around expiring txn state on the group side.
As I mentioned before in https://github.com/redpanda-data/redpanda/pull/5851#issuecomment-1208032235, to test it we should use a chaos test.
@bharathv if you are fine with the changes, can you please dismiss your review?
I just have one comment https://github.com/redpanda-data/redpanda/pull/5851/commits/61dd4ada9dec70f95df4f7989bf34d7812ba6220#r951630194 , lgtm once that is fixed.