[improve][broker] Supplement schema ledger if schema ledger is lost
Fixes #20414
Master Issue: https://github.com/apache/pulsar/issues/20414
Motivation
https://github.com/apache/pulsar/issues/17221 describes an environment in which multiple bookie copies are corrupted or a ledger has been deleted. When a schema ledger is lost, new producers and consumers cannot be created and cannot work properly.
So we need a solution that does not just skip the schema with the missing ledger, but actually supplements the broken schema ledger.
Modifications
Add a new method tryCompleteTheLostSchema() in SchemaStorage and SchemaRegistry
CompletableFuture<Long> tryCompleteTheLostSchemaLedger(String key, SchemaVersion version, SchemaData schema);
- Get the schema locator from the metadata store.
- Create a new ledger and write a SchemaStorageFormat.SchemaEntry built with schemaData and schemaVersion.
- Update the schema locator in the metadata store with the new ledger ID.
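The three steps can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the code in this PR: `LostSchemaLedgerSketch`, `Locator`, `Entry`, and the two maps standing in for the metadata store and BookKeeper are hypothetical, and the version is simplified to a `long` instead of `SchemaVersion`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical in-memory model of the recovery flow; none of these types are Pulsar classes.
final class LostSchemaLedgerSketch {

    // Stand-in for the schema locator: maps a schema version to the ledger that stores it.
    static final class Locator {
        final Map<Long, Long> ledgerByVersion = new HashMap<>();
    }

    // Stand-in for a schema entry built from the schema data and its version.
    static final class Entry {
        final long version;
        final byte[] schemaData;

        Entry(long version, byte[] schemaData) {
            this.version = version;
            this.schemaData = schemaData;
        }
    }

    // Stand-in for the metadata store: schema key -> locator.
    static final Map<String, Locator> metadataStore = new HashMap<>();
    // Stand-in for BookKeeper: ledger id -> entries written to that ledger.
    static final Map<Long, List<Entry>> ledgers = new HashMap<>();
    static long nextLedgerId = 100;

    static CompletableFuture<Long> tryCompleteTheLostSchemaLedger(String key, long version, byte[] schemaData) {
        // Step 1: read the locator from the metadata store.
        Locator locator = metadataStore.get(key);
        if (locator == null) {
            CompletableFuture<Long> failed = new CompletableFuture<>();
            failed.completeExceptionally(new IllegalStateException("no schema locator for " + key));
            return failed;
        }

        // Step 2: create a new ledger and write the rebuilt entry to it.
        long newLedgerId = nextLedgerId++;
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry(version, schemaData));
        ledgers.put(newLedgerId, entries);

        // Step 3: point the locator at the new ledger for this version.
        locator.ledgerByVersion.put(version, newLedgerId);
        return CompletableFuture.completedFuture(newLedgerId);
    }

    public static void main(String[] args) {
        Locator locator = new Locator();
        locator.ledgerByVersion.put(0L, 7L); // ledger 7 is assumed lost
        metadataStore.put("example-key", locator);

        long newLedgerId = tryCompleteTheLostSchemaLedger("example-key", 0L, new byte[] {1, 2, 3}).join();
        System.out.println("version 0 now stored in ledger " + newLedgerId);
    }
}
```

In this sketch the locator is updated only after the entry has been written, so a failed write never leaves the locator pointing at an empty ledger. Whether the real implementation needs the same ordering, or a compare-and-set on the locator, is left open here.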
Verifying this change
- [x] Make sure that the change passes the CI checks.
Documentation
- [ ] doc
- [ ] doc-required
- [x] doc-not-needed
- [ ] doc-complete
Matching PR in forked repository
PR in forked repository: https://github.com/Denovo1998/pulsar/pull/4
@poorbarcode @congbobo184 @codelipenghui
SchemaData and SchemaVersion have been moved to org.apache.pulsar.broker.service.AbstractTopic, rather than being saved in each producer and consumer. Please check out the solution in issue #20414. Is this approach okay now?
The pr had no activity for 30 days, mark with Stale label.
Waiting to discuss whether this plan is feasible. I will send an email to discuss it later.
The pr had no activity for 30 days, mark with Stale label.
The implementation has been updated to use the alternative approach. It still needs to be discussed.