fix: Prevent StackOverFlowException in SHARED subscription
Fixes #16074
Could you provide more analysis of how this happens?
#16074
If the depth is exceeded, we dispatch the call to another thread, but it may fail again in that thread.
Maybe we should abort the invocation once the max depth is reached?
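To make the depth-limit idea concrete, here is a minimal sketch of the general pattern: count re-entrant calls per thread and, once a limit is hit, hand the continuation to an executor so it starts on a fresh stack. This is not the code from this PR or from Pulsar's dispatcher. `DepthLimitedReader`, `MAX_DEPTH`, and this `readMore` are hypothetical names, and the synchronous re-entry only stands in for a read callback that completes inline.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of depth-limited re-entry; not Pulsar code. */
final class DepthLimitedReader {
    // Assumed limit for the sketch; a real value would need tuning.
    static final int MAX_DEPTH = 100;

    private final ExecutorService executor;
    // Depth is tracked per thread, so each executor task starts counting at 0.
    private final ThreadLocal<Integer> depth = ThreadLocal.withInitial(() -> 0);
    private final CountDownLatch done = new CountDownLatch(1);
    final AtomicInteger processed = new AtomicInteger();
    final AtomicInteger handoffs = new AtomicInteger();

    DepthLimitedReader(ExecutorService executor) {
        this.executor = executor;
    }

    /** Processes 'remaining' items, re-entering itself once per item. */
    void readMore(int remaining) {
        if (remaining == 0) {
            done.countDown();
            return;
        }
        int d = depth.get();
        if (d >= MAX_DEPTH) {
            // Too deep: continue from the executor instead of recursing further.
            handoffs.incrementAndGet();
            executor.execute(() -> readMore(remaining));
            return;
        }
        depth.set(d + 1);
        try {
            processed.incrementAndGet();
            readMore(remaining - 1); // simulates a callback completing inline
        } finally {
            depth.set(d); // restore the counter while the stack unwinds
        }
    }

    /** Waits for completion; returns false on timeout or interruption. */
    boolean await(long timeout, TimeUnit unit) {
        try {
            return done.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        DepthLimitedReader reader = new DepthLimitedReader(pool);
        reader.readMore(100_000); // unbounded recursion this deep could overflow the stack
        boolean finished = reader.await(10, TimeUnit.SECONDS);
        pool.shutdown();
        System.out.println("finished=" + finished
                + " processed=" + reader.processed.get()
                + " handoffs=" + reader.handoffs.get());
    }
}
```

In this sketch the second thread cannot overflow the same way, because every handed-off task begins with an empty stack and a counter of 0. Whether that holds for the real dispatcher depends on how the depth is tracked there, which is what the question above is asking.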
> Could you provide more analysis about how does it happen?
We found this in our production environment. It can occur when all consumers disconnect and then connect again.
I just found a similar fix, see https://github.com/apache/pulsar/pull/14121. Could you also share your view on this PR? @eolivelli
Actually, I'm a little confused about the cases in which readMoreEntries should be called in BrokerService's executor. /cc @lhotari
@leizhiyuan Please provide a correct documentation label for your PR. Instructions see Pulsar Documentation Label Guide.
The PR had no activity for 30 days, mark with Stale label.
Since this has taken a long time and the code has diverged too much, I will close this PR.