[Bug][broker] Cursor reads in a dead loop during tailing reads when enableTransaction is set
Search before asking
- [X] I searched in the issues and found nothing similar.
Read release policy
- [X] I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.
Version
client: pulsar-3.0.5 broker: pulsar-3.0.5
Minimal reproduce step
Run a transactional producer and a normal consumer against a 200-partition topic with pulsar-perf. Throughput is 10 MB/s, batchSize is 10, and subscriptionType is Exclusive. The consumer does a tailing read, consuming the latest messages.
Producer config: -txn -nmt 1000 -time 0 -s 1024 -i 60 -bm 10 -b 1000 -bb 4194304 -r 10000 -mk random -threads 3
Consumer config: -time 0 -i 60 -s sub_test_txn_p200 -ss sub_test_txn_p200 -sp Latest -ioThreads 1 -n 1
What did you expect to see?
CPU load on the broker is low.
What did you see instead?
The broker shows high CPU load even though throughput is low.
Anything else?
This issue has been reported before, but it still exists on the master branch. It is a serious issue because it makes transactions unavailable.
The root cause is:
In ManagedCursorImpl#asyncReadEntriesWithSkipOrWait, hasMoreEntries() only compares readPosition with lastConfirmedEntry. However, when enableTransaction is set, maxReadPosition also decides whether an entry can be read.
Currently, if readPosition < lastConfirmedEntry && readPosition > maxReadPosition, hasMoreEntries() passes and the read is issued immediately. But once we enter internalReadFromLedger(), we go into opReadEntry.checkReadCompletion(), which then triggers callback.readEntriesComplete().
As a result, the cursor keeps issuing reads in a dead loop, even though there is actually nothing it needs to read.
https://github.com/apache/pulsar/blob/5dc030431a60b49e81d577cd06a1ae63dbee0293/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L934-L979
https://github.com/apache/pulsar/blob/5dc030431a60b49e81d577cd06a1ae63dbee0293/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L2051-L2056
https://github.com/apache/pulsar/blob/5dc030431a60b49e81d577cd06a1ae63dbee0293/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java#L164-L186
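The condition described above can be illustrated with a small standalone sketch. This is not Pulsar code: positions are simplified to a single long, and the class TailingReadSketch, the method simulateReads, and the maxAttempts cap are all hypothetical names introduced only for illustration. The extra check against maxReadPosition is one possible guard, shown as an assumption rather than the actual fix.

```java
// Hypothetical, simplified model of the read loop; not the real ManagedCursorImpl.
public class TailingReadSketch {

    // Models the check described in the issue: only readPosition and
    // lastConfirmedEntry are compared (simplified to longs).
    static boolean hasMoreEntries(long readPosition, long lastConfirmedEntry) {
        return readPosition <= lastConfirmedEntry;
    }

    // Counts read attempts before the cursor would stop and wait.
    // maxAttempts is an artificial cap so the sketch always terminates.
    static int simulateReads(long readPosition, long lastConfirmedEntry,
                             long maxReadPosition, boolean checkMaxReadPosition,
                             int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            boolean readable = hasMoreEntries(readPosition, lastConfirmedEntry);
            if (checkMaxReadPosition) {
                // Assumed guard: also require readPosition <= maxReadPosition.
                readable = readable && readPosition <= maxReadPosition;
            }
            if (!readable) {
                break; // the cursor would register a waiting callback here
            }
            attempts++;
            // Entries beyond maxReadPosition cannot be read, so the read
            // completes without advancing readPosition.
            if (readPosition <= maxReadPosition) {
                readPosition++;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // readPosition=95 is below lastConfirmedEntry=100 but above maxReadPosition=90.
        System.out.println("without guard: " + simulateReads(95, 100, 90, false, 1000));
        System.out.println("with guard:    " + simulateReads(95, 100, 90, true, 1000));
    }
}
```

Without the guard, the loop only stops because of the artificial cap, which mirrors the busy loop seen on the broker. With the guard, the sketch stops at once and would wait for maxReadPosition to advance.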
Are you willing to submit a PR?
- [X] I'm willing to submit a PR!