pulsar icon indicating copy to clipboard operation
pulsar copied to clipboard

[improve][broker] Optimize seeking by timestamp

Open dao-jun opened this issue 1 year ago • 6 comments

Fixes https://github.com/apache/pulsar/issues/22129

Motivation

Pulsar uses binary search to find the message by timestamp, it will reduce the number of iterations to find the message, and make it more efficient and faster.

Even though the current implementation is correct, and using binary search to speed-up, but it's still not efficient enough. The current implementation is to scan all the ledgers to find the message by timestamp. This is a performance bottleneck, especially for large topics with many messages. Say, if there is a topic which has 1m entries, through the binary search, it will take 20 iterations to find the message.

In some extreme cases, it may lead to a timeout, and the client will not be able to seeking by timestamp.

The motivation of this PR is to optimize the finding message by timestamp, to make it more efficient and faster.

Modifications

Before search entires, calculate the start, end position by LedgerInfo#timestamp and only search entries in the range to avoid search the entire ledgers.

Verifying this change

  • [ ] Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • [ ] Dependencies (add or upgrade a dependency)
  • [ ] The public API
  • [ ] The schema
  • [ ] The default values of configurations
  • [ ] The threading model
  • [ ] The binary protocol
  • [ ] The REST endpoints
  • [ ] The admin CLI options
  • [ ] The metrics
  • [ ] Anything that affects deployment

Documentation

  • [ ] doc
  • [ ] doc-required
  • [x] doc-not-needed
  • [ ] doc-complete

Matching PR in forked repository

PR in forked repository:

dao-jun avatar May 28 '24 15:05 dao-jun

Codecov Report

Attention: Patch coverage is 77.77778% with 8 lines in your changes missing coverage. Please review.

Project coverage is 75.66%. Comparing base (bbc6224) to head (512fd0d). Report is 835 commits behind head on master.

Files with missing lines Patch % Lines
...che/bookkeeper/mledger/impl/ManagedCursorImpl.java 75.00% 3 Missing and 2 partials :warning:
...er/service/persistent/PersistentMessageFinder.java 81.25% 1 Missing and 2 partials :warning:
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #22792      +/-   ##
============================================
+ Coverage     73.57%   75.66%   +2.09%     
+ Complexity    32624     5895   -26729     
============================================
  Files          1877     1961      +84     
  Lines        139502   160677   +21175     
  Branches      15299    19557    +4258     
============================================
+ Hits         102638   121583   +18945     
- Misses        28908    30314    +1406     
- Partials       7956     8780     +824     
Flag Coverage Δ
inttests 29.57% <0.00%> (+4.99%) :arrow_up:
systests 25.62% <0.00%> (+1.30%) :arrow_up:
unittests 75.37% <77.77%> (+2.53%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...a/org/apache/bookkeeper/mledger/ManagedCursor.java 31.57% <ø> (-3.72%) :arrow_down:
...er/service/persistent/PersistentMessageFinder.java 70.49% <81.25%> (+4.58%) :arrow_up:
...che/bookkeeper/mledger/impl/ManagedCursorImpl.java 82.03% <75.00%> (+2.74%) :arrow_up:

... and 703 files with indirect coverage changes

codecov-commenter avatar May 28 '24 18:05 codecov-commenter

Can I help merge this PR?

KannarFr avatar Jul 17 '24 13:07 KannarFr

Can I help merge this PR?

This PR needs more review

dao-jun avatar Jul 18 '24 02:07 dao-jun

@dao-jun please rebase

lhotari avatar Oct 14 '24 11:10 lhotari

@lhotari Rebased, please help review when you have time

dao-jun avatar Oct 17 '24 13:10 dao-jun

Closed and reopened to get the change to unblock CI.

lhotari avatar Oct 17 '24 14:10 lhotari

Hello folks, do you have any news about this pull request? We hit this issue on production and we (Clever Cloud) could help to test if necessary.

FlorentinDUBOIS avatar Nov 05 '24 10:11 FlorentinDUBOIS

close reopen to trigger the CI checks

dao-jun avatar Nov 07 '24 06:11 dao-jun

Closing and reopening to trigger a new CI run with latest changes from master branch.

lhotari avatar Jan 08 '25 14:01 lhotari

@dao-jun I pushed 914b2ca to this PR to avoid breaking the ManagedCursor interface. This way we can make this change to Pulsar 4.0 without needing a PIP.

lhotari avatar Jan 09 '25 05:01 lhotari

@dao-jun I pushed 914b2ca to this PR to avoid breaking the ManagedCursor interface. This way we can make this change to Pulsar 4.0 without needing a PIP.

Thanks

dao-jun avatar Jan 09 '25 07:01 dao-jun

@dao-jun To speed up the review, I pushed the changes for adding a configuration parameter managedLedgerCursorResetLedgerCloseTimestampMaxClockSkewMillis and logic to handle it. That was something that I was concerned about. Do the changes look ok to you?

    @FieldContext(
            category = CATEGORY_STORAGE_ML,
            dynamic = true,
            doc = "When resetting a subscription by timestamp, the broker will use the"
                    + " ledger closing timestamp metadata to determine the range of ledgers"
                    + " to search for the message where the subscription position is reset to. "
                    + " Since by default, the search condition is based on the message publish time provided by the "
                    + " client at the publish time, there will be some clock skew between the ledger closing timestamp "
                    + " metadata and the publish time."
                    + " This configuration is used to set the max clock skew between the ledger closing"
                    + " timestamp and the message publish time for finding the range of ledgers to open for searching."
                    + " The default value is 60000 milliseconds (60 seconds). When set to -1, the broker will not"
                    + " use the ledger closing timestamp metadata to determine the range of ledgers to search for the"
                    + " message."
    )
    private int managedLedgerCursorResetLedgerCloseTimestampMaxClockSkewMillis = 60000;

lhotari avatar Jan 09 '25 08:01 lhotari

@dao-jun To speed up the review, I pushed the changes for adding a configuration parameter managedLedgerCursorResetLedgerCloseTimestampMaxClockSkewMillis and logic to handle it. That was something that I was concerned about. Do the changes look ok to you?

    @FieldContext(
            category = CATEGORY_STORAGE_ML,
            dynamic = true,
            doc = "When resetting a subscription by timestamp, the broker will use the"
                    + " ledger closing timestamp metadata to determine the range of ledgers"
                    + " to search for the message where the subscription position is reset to. "
                    + " Since by default, the search condition is based on the message publish time provided by the "
                    + " client at the publish time, there will be some clock skew between the ledger closing timestamp "
                    + " metadata and the publish time."
                    + " This configuration is used to set the max clock skew between the ledger closing"
                    + " timestamp and the message publish time for finding the range of ledgers to open for searching."
                    + " The default value is 60000 milliseconds (60 seconds). When set to -1, the broker will not"
                    + " use the ledger closing timestamp metadata to determine the range of ledgers to search for the"
                    + " message."
    )
    private int managedLedgerCursorResetLedgerCloseTimestampMaxClockSkewMillis = 60000;

LGTM

dao-jun avatar Jan 09 '25 10:01 dao-jun

Thanks a lot for the work done here @dao-jun @lhotari!

KannarFr avatar Jan 09 '25 13:01 KannarFr

@dao-jun There's a bug report that might be related to this change. Do you have a chance to check #23910? Thanks

lhotari avatar Jan 29 '25 23:01 lhotari

@dao-jun There's a bug report that might be related to this change. Do you have a chance to check #23910? Thanks

Will do it after a few days~

dao-jun avatar Jan 30 '25 09:01 dao-jun

https://lists.apache.org/thread/jp1rpjqz9cczsro3b36t15z239zqfqmy Cherry-picked

poorbarcode avatar May 07 '25 02:05 poorbarcode