
[Bug][broker] cursor will read in dead loop when do tailing-read with enableTransaction

Open TakaHiR07 opened this issue 8 months ago • 0 comments

Search before asking

  • [X] I searched in the issues and found nothing similar.

Read release policy

  • [X] I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

client: pulsar-3.0.5 broker: pulsar-3.0.5

Minimal reproduce step

Run a transactional produce and a normal (non-transactional) consume against a 200-partition topic with pulsar-perf. Throughput is 10 MB/s, batch size is 10, and the subscription type is Exclusive. The consumer does a tailing read, i.e. it consumes the latest messages.

Produce config: -txn -nmt 1000 -time 0 -s 1024 -i 60 -bm 10 -b 1000 -bb 4194304 -r 10000 -mk random -threads 3

Consume config: -time 0 -i 60 -s sub_test_txn_p200 -ss sub_test_txn_p200 -sp Latest -ioThreads 1 -n 1

What did you expect to see?

CPU load on the broker stays low.

What did you see instead?

The broker handles little throughput but shows high CPU load.


Anything else?

This issue has been reported before, but it still exists in the master branch. It is a serious issue that makes transactions unavailable.

The root cause is:

In ManagedCursorImpl#asyncReadEntriesWithSkipOrWait, hasMoreEntries() only compares readPosition with lastConfirmedEntry. However, when transactions are enabled, maxReadPosition also determines whether an entry can be read.

Currently, if readPosition < lastConfirmedEntry && readPosition > maxReadPosition, the cursor decides it can read entries immediately. But once it enters internalReadFromLedger(), it goes into opReadEntry.checkReadCompletion(), which then triggers callback.readEntriesComplete().

As a result, the cursor keeps reading in a dead loop, even though there is actually nothing it needs to read.
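To make the loop condition concrete, here is a minimal, self-contained sketch. It is not Pulsar code: positions are modeled as plain longs, the class and the method names countEmptyReads and hasReadableEntries are hypothetical, and it assumes that a read attempted beyond maxReadPosition completes without delivering entries and without advancing readPosition. The stricter check is only an illustration of taking maxReadPosition into account, not necessarily the fix that should go into the broker.

```java
// Simplified model of the loop described above. Positions are plain longs here,
// not Pulsar's Position type, and all names are illustrative.
class TailingReadLoopSketch {

    // The check as described in this issue: readPosition vs. lastConfirmedEntry only.
    static boolean hasMoreEntries(long readPosition, long lastConfirmedEntry) {
        return readPosition <= lastConfirmedEntry;
    }

    // Hypothetical stricter check that also takes maxReadPosition into account.
    static boolean hasReadableEntries(long readPosition, long lastConfirmedEntry,
                                      long maxReadPosition) {
        return readPosition <= Math.min(lastConfirmedEntry, maxReadPosition);
    }

    // Counts read completions that deliver nothing, stopping at maxIterations
    // so that the demonstration terminates.
    static int countEmptyReads(long readPosition, long lastConfirmedEntry,
                               long maxReadPosition, boolean respectMaxReadPosition,
                               int maxIterations) {
        int emptyReads = 0;
        while (emptyReads < maxIterations) {
            boolean readNow = respectMaxReadPosition
                    ? hasReadableEntries(readPosition, lastConfirmedEntry, maxReadPosition)
                    : hasMoreEntries(readPosition, lastConfirmedEntry);
            if (!readNow) {
                break; // the cursor would park the read and wait for new entries
            }
            if (readPosition <= maxReadPosition) {
                break; // entries would actually be delivered, so no empty loop
            }
            // Assumed behavior: the read completes with no entries, readPosition
            // does not move, and the caller immediately issues another read.
            emptyReads++;
        }
        return emptyReads;
    }

    public static void main(String[] args) {
        // readPosition=11, lastConfirmedEntry=20, maxReadPosition=10
        System.out.println(countEmptyReads(11, 20, 10, false, 1000));
        System.out.println(countEmptyReads(11, 20, 10, true, 1000));
    }
}
```

With the check that ignores maxReadPosition, the sketch spins until it hits the iteration cap. With the stricter check, it stops before the first read.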

https://github.com/apache/pulsar/blob/5dc030431a60b49e81d577cd06a1ae63dbee0293/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L934-L979

https://github.com/apache/pulsar/blob/5dc030431a60b49e81d577cd06a1ae63dbee0293/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L2051-L2056

https://github.com/apache/pulsar/blob/5dc030431a60b49e81d577cd06a1ae63dbee0293/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java#L164-L186

Are you willing to submit a PR?

  • [X] I'm willing to submit a PR!

TakaHiR07, Jun 19 '24 12:06