[improve][pip] PIP-414: Enforce topic consistency check
Motivation
The changes from #24118 have been merged and should be documented in a Pulsar Improvement Proposal (PIP) to record these important updates.
Documentation
- [ ] doc
- [ ] doc-required
- [x] doc-not-needed
- [ ] doc-complete
Another approach is to create the missed partitioned metadata automatically. For example, with this PIP implemented, when the broker detects that my-topic-partition-0 is accessed by a client but there is no partition metadata for my-topic, the broker should try creating the partition metadata to unblock the client. The number of partitions is determined by the largest partition index N in my-topic-partition-N.
At least we need to recover the partition metadata for such orphan partitions. Currently it's impossible:
> admin topics create-partitioned-topic my-topic -p 1
2025-04-25T19:45:14,373+0800 [AsyncHttpClient-7-2] WARN org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:8080/admin/v2/persistent/public/default/my-topic/partitions?createLocalTopicOnly=false] Failed to perform http put request: javax.ws.rs.ClientErrorException: HTTP 409 {"reason":"This topic already exists"}
This topic already exists
Good catch, I can fix this bug. I think this command helps the user to recover the partitioned metadata.
> Another approach is to create the missed partitioned metadata automatically.
This is difficult because we can not obtain the maximum partition index.
> This is difficult because we can not obtain the maximum partition index.
We can list all non-partitioned topics of the <topic>-partition-<index> pattern and find the maximum index.
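To make the listing idea concrete, here is a minimal sketch in plain Java. It is only an illustration, not code from the PR: the class `PartitionCountInference` and the method `inferPartitionCount` are hypothetical names, and the input list stands in for whatever call returns the non-partitioned topics of a namespace. It also assumes partition indexes start at 0, so the inferred partition count is the maximum index plus one.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Pulsar code base.
final class PartitionCountInference {
    private static final String PARTITION_SUFFIX = "-partition-";

    /**
     * Scans the given topic names for "<baseTopic>-partition-<index>" and returns
     * the inferred partition count (largest index + 1, assuming indexes start at 0).
     * Returns 0 when no topic in the list looks like a partition of baseTopic.
     */
    static int inferPartitionCount(String baseTopic, List<String> nonPartitionedTopics) {
        // Quote the base name so characters such as '.' are matched literally.
        Pattern pattern = Pattern.compile(
                Pattern.quote(baseTopic + PARTITION_SUFFIX) + "(\\d{1,9})");
        int maxIndex = -1;
        for (String topic : nonPartitionedTopics) {
            Matcher matcher = pattern.matcher(topic);
            if (matcher.matches()) {
                maxIndex = Math.max(maxIndex, Integer.parseInt(matcher.group(1)));
            }
        }
        return maxIndex + 1;
    }
}
```

With the topics my-topic-partition-0 and my-topic-partition-2 present, this sketch would infer 3 partitions for my-topic, which also shows the limitation: gaps in the indexes cannot be told apart from partitions that were never created.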
> > This is difficult because we can not obtain the maximum partition index.
>
> We can list all non-partitioned topics of the <topic>-partition-<index> pattern and find the maximum index.
@BewareMyPower Do this during topic creation (e.g., via admin topics create-partitioned-topic) or at runtime?
> Another approach is to create the missed partitioned metadata automatically. For example, with this PIP implemented, when the broker detects that my-topic-partition-0 is accessed by a client but there is no partition metadata for my-topic, the broker should try creating the partition metadata to unblock the client. The number of partitions is determined by the largest partition index N in my-topic-partition-N.
I agree with this plan. Our cluster has been running for many years, so there are likely quite a few topics with metadata errors.
Perhaps we should provide an additional admin interface to query topics with metadata errors.
> I agree with this plan. Our cluster has been running for many years, so there are likely quite a few topics with metadata errors.
@crossoverJie If the broker automatically creates the partitioned metadata, this may break existing user behavior.
> Perhaps we should provide an additional admin interface to query topics with metadata errors.
If the topic type is incorrect, the user can use the create partitioned topic command to fix the metadata error.
> Do this during topic creation (e.g., via admin topics create-partitioned-topic) or at runtime?
Topic creation, or BrokerService#getTopic on a non-partitioned topic whose name ends with -partition-<index>.
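For readers following along, the condition described here could look roughly like the sketch below. It is plain Java and purely illustrative: `OrphanPartitionCheck`, `baseTopicOf`, and `isOrphanPartition` are hypothetical names, and the `Set<String>` parameter is an assumed stand-in for a lookup of which topics have partitioned metadata. It does not reflect how BrokerService#getTopic is actually implemented.

```java
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch for illustration only; not the broker's real logic.
final class OrphanPartitionCheck {
    // Matches names ending with "-partition-<index>" and captures the base topic name.
    private static final Pattern PARTITION_NAME =
            Pattern.compile("(.+)-partition-(\\d{1,9})");

    /** Returns the base topic name if topicName ends with -partition-<index>. */
    static Optional<String> baseTopicOf(String topicName) {
        Matcher matcher = PARTITION_NAME.matcher(topicName);
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /**
     * True when topicName looks like a partition but its base topic has no
     * partitioned metadata. The set is an assumed stand-in for a metadata lookup.
     */
    static boolean isOrphanPartition(String topicName, Set<String> topicsWithPartitionedMetadata) {
        return baseTopicOf(topicName)
                .map(base -> !topicsWithPartitionedMetadata.contains(base))
                .orElse(false);
    }
}
```

Under these assumptions, my-topic-partition-0 is flagged when my-topic has no partitioned metadata, while a plain my-topic is never flagged.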
> > Do this during topic creation (e.g., via admin topics create-partitioned-topic) or at runtime?
>
> Topic creation, or BrokerService#getTopic on a non-partitioned topic whose name ends with -partition-<index>.
@BewareMyPower Topic creation is the best place to handle this. Doing it at runtime (e.g., in BrokerService#getTopic) would be too late and would introduce black-box behavior that's harder to reason about and debug.
I don't think creating missing topic partition metadata at runtime is OK; as @nodece says, it would introduce unpredictable consequences. I think adding a new config is a relatively good approach.
Then supporting the creation of missed partitions via the admin API for such cases could be the solution. Currently, the create-missed-partitions API does not work.
> I don't think creating missing topic partition metadata at runtime is OK; as @nodece says, it would introduce unpredictable consequences. I think adding a new config is a relatively good approach.
@dao-jun Instead of introducing a flag to enable or disable the creation of missing topic partition metadata at runtime, this feature will always be enabled. Since disabling this feature can lead to unexpected behavior and complexity, enabling it unconditionally simplifies the design and avoids confusion.
> Then supporting the creation of missed partitions via the admin API for such cases could be the solution. Currently, the create-missed-partitions API does not work.
@BewareMyPower admin topics create-partitioned-topic my-topic -p 1 has been fixed by #24225
Could you review this PIP again?