parallel-consumer
ProducerManager should handle different types of transaction failures appropriately
ProducerManager#commitOffsets handles all transaction commit failures the same way: it just retries. This is wishful thinking. We should treat each failure properly, by giving up faster, shutting down, or retrying, as appropriate. This probably hasn't been noticed yet because transaction commit mode isn't popular; people mostly use the consumer commit modes.
At worst, this issue causes the Producer to keep retrying much longer than it should before crashing. Because the retry is a tight loop, it shouldn't take long to crash. Once process monitoring restarts the application, uncommitted messages will be retried and processing should continue correctly.
For reference, KafkaProducer#commitTransaction throws:
java.lang.IllegalStateException - if no transactional.id has been configured or no transaction has been started
ProducerFencedException - fatal error indicating another producer with the same transactional.id is active
UnsupportedVersionException - fatal error indicating the broker does not support transactions (i.e. if its version is lower than 0.11.0.0)
AuthorizationException - fatal error indicating that the configured transactional.id is not authorized. See the exception for more details
KafkaException - if the producer has encountered a previous fatal or abortable error, or for any other unexpected error
TimeoutException - if the time taken for committing the transaction has surpassed max.block.ms.
InterruptException - if the thread is interrupted while blocked
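As a rough illustration of what "treating each failure properly" could look like, here is a minimal sketch that maps the exceptions listed above to one of three actions and retries only the retriable case, up to a bound. It is not the code in ProducerManager. The class name `TxCommitFailurePolicy`, the `Action` enum, and `commitWithPolicy` are hypothetical, and the nested exception types are local stand-ins named after the Kafka client exceptions so the sketch compiles without kafka-clients. The mapping itself (for example, retrying only on TimeoutException) is an assumption to be checked against the Kafka documentation.

```java
// Sketch only. The nested exception classes are stand-ins for the Kafka client
// exceptions of the same name; real code would import them from kafka-clients.
final class TxCommitFailurePolicy {

    // Hypothetical set of reactions to a failed transaction commit.
    enum Action { RETRY, GIVE_UP, SHUT_DOWN }

    static class KafkaException extends RuntimeException {}
    static class ProducerFencedException extends KafkaException {}
    static class UnsupportedVersionException extends KafkaException {}
    static class AuthorizationException extends KafkaException {}
    static class TimeoutException extends KafkaException {}
    static class InterruptException extends KafkaException {}

    // Assumed mapping: errors documented as fatal shut the producer down,
    // a timeout is retried, everything else gives up on this commit.
    static Action classify(RuntimeException e) {
        if (e instanceof ProducerFencedException
                || e instanceof UnsupportedVersionException
                || e instanceof AuthorizationException
                || e instanceof IllegalStateException) {
            return Action.SHUT_DOWN;
        }
        if (e instanceof TimeoutException) {
            return Action.RETRY;
        }
        // InterruptException and any other KafkaException land here.
        return Action.GIVE_UP;
    }

    // Runs the commit, retrying only failures classified as RETRY and never
    // more than maxAttempts times in total. Any other failure is rethrown
    // immediately so the caller can abort or shut down.
    static void commitWithPolicy(Runnable commit, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                commit.run();
                return;
            } catch (RuntimeException e) {
                if (classify(e) != Action.RETRY || attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
```

The checks for the specific subclasses come before the catch-all because, in the stand-ins as in the Kafka client, they all extend KafkaException. A caller would wrap the real commit call in the `Runnable` and decide what to do with the rethrown exception based on `classify`.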
Raised in question https://github.com/confluentinc/parallel-consumer/discussions/112
Pushed some ideas on this: https://github.com/confluentinc/parallel-consumer/compare/master...astubbs:tx-commit-failure It needs a test added to capture some corner cases. Not a high priority, as producer commit mode isn't popular and the system will recover: it will quickly crash if it fails too many times and get restarted, and after a restart there shouldn't be any issue.
Closing Issue