
[broker] Revert PR 6404: "Consumer received duplicated delayed messages upon restart"

Open dao-jun opened this issue 1 year ago • 2 comments

Motivation

https://github.com/apache/pulsar/pull/6404 was meant to fix duplicated messages after a broker restart (especially for delayed messages). Since https://github.com/apache/pulsar/pull/19035 was introduced, that problem has been solved to some extent, but this logic still exists, and in some cases it may lead to more serious duplication problems.

https://github.com/apache/pulsar/pull/6404 tries to skip reading entries that are about to be replayed. For example, if entry [1] is a delayed message and the current readPosition is [1], it moves the readPosition to [2] so that normal read operations do not read that entry again, which avoids the duplicate.

However, it may have a chance to move the Cursor#readPosition backward:

|        | replay read               | normal read               |
| step 1 | read entry 1              | read entries [1, 10]      |
| step 2 | the readPosition equals 1 |                           |
| step 3 |                           | read [1, 10] complete     |
| step 4 |                           | update readPosition to 11 |
| step 5 | update readPosition to 2  |                           |

Because the readPosition was moved backward to 2, the next read will fetch [2, 11] from BK and dispatch them, even though the broker has already read [1, 10] from BK and dispatched them to the client. As a result, [2, 10] will be duplicated.
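
The interleaving above can be replayed in a small single-threaded toy model. This is only a sketch for illustration, not Pulsar code: `ReadPositionRace`, `simulate`, `normalRead`, and `duplicates` are hypothetical names, entries are modeled as plain `long` ids in a single ledger, and the `skipReplayedEntry` flag stands in for the PR 6404 logic (`true`) versus the reverted behavior (`false`).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the race described above. All names are hypothetical, not Pulsar APIs.
class ReadPositionRace {
    // Stand-in for Cursor#readPosition (entry id only, single ledger assumed).
    static long readPosition;

    // Models a normal read: returns `count` consecutive entry ids starting at `start`.
    static List<Long> normalRead(long start, int count) {
        List<Long> entries = new ArrayList<>();
        for (long id = start; id < start + count; id++) {
            entries.add(id);
        }
        return entries;
    }

    // Replays steps 1-5 in order, then performs one more normal read.
    // skipReplayedEntry = true models the PR 6404 logic; false models the revert.
    static List<Long> simulate(boolean skipReplayedEntry) {
        readPosition = 1;
        List<Long> dispatched = new ArrayList<>();

        // Step 1: a replay read targets entry 1 while a normal read of [1, 10] starts.
        long replayEntry = 1;
        long normalReadStart = readPosition;

        // Step 2: the replay path observes that readPosition equals the replayed entry.
        boolean wantsSkip = skipReplayedEntry && readPosition == replayEntry;

        // Steps 3 and 4: the normal read completes, dispatches, and advances to 11.
        dispatched.addAll(normalRead(normalReadStart, 10));
        readPosition = normalReadStart + 10;

        // Step 5: the replay path applies its stale decision and moves the position back to 2.
        if (wantsSkip) {
            readPosition = replayEntry + 1;
        }

        // Next normal read starts from wherever readPosition ended up.
        dispatched.addAll(normalRead(readPosition, 10));
        return dispatched;
    }

    // Returns every entry id that was dispatched more than once, in dispatch order.
    static List<Long> duplicates(List<Long> dispatched) {
        Set<Long> seen = new HashSet<>();
        List<Long> dups = new ArrayList<>();
        for (long id : dispatched) {
            if (!seen.add(id)) {
                dups.add(id);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        System.out.println(duplicates(simulate(true)));  // prints [2, 3, 4, 5, 6, 7, 8, 9, 10]
        System.out.println(duplicates(simulate(false))); // prints []
    }
}
```

In this model the skip logic causes entries 2 through 10 to be dispatched twice, while the run without it reads [11, 20] next and produces no duplicates. The model deliberately ignores how the replayed entry 1 itself is deduplicated, which the description attributes to https://github.com/apache/pulsar/pull/19035.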

Modifications

Verifying this change

  • [ ] Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • [ ] Dependencies (add or upgrade a dependency)
  • [ ] The public API
  • [ ] The schema
  • [ ] The default values of configurations
  • [ ] The threading model
  • [ ] The binary protocol
  • [ ] The REST endpoints
  • [ ] The admin CLI options
  • [ ] The metrics
  • [ ] Anything that affects deployment

Documentation

  • [ ] doc
  • [ ] doc-required
  • [x] doc-not-needed
  • [ ] doc-complete

Matching PR in forked repository

PR in forked repository:

dao-jun avatar May 30 '24 05:05 dao-jun

but this logic still exists, in some cases, it may lead to more serious duplication problems.

Please share more details. Do you have a chance to add a test case for this PR?

lhotari avatar May 30 '24 23:05 lhotari

@lhotari I updated the description

dao-jun avatar May 31 '24 01:05 dao-jun