[Bug] Messages lost when synchronizing with namespace-level geo-replication
Search before reporting
- [x] I searched in the issues and found nothing similar.
Read release policy
- [x] I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.
User environment
Pulsar version 3.0.11
Issue Description
Currently, there are two clusters that have set up geo namespace replica synchronization, but in one of the clusters, it is found that the messages synchronized from the cluster are far different from the main cluster traffic and storage.
Error messages
No error message was seen
Reproducing the issue
Additional information
null
Are you willing to submit a PR?
- [x] I'm willing to submit a PR!
@HGHNice What do you mean by "the messages synchronized from the cluster are far different from the main cluster traffic and storage"? Please provide concrete examples to make this easier to understand.
I have set up bidirectional geo-replication between two clusters, following the official website. When I produce messages to the main cluster, I expect the storage on both clusters to end up the same. In fact, there are large storage differences between the two clusters, as the figure below shows, which is why I have doubts. What other figures do I need to provide for analysis?
Or perhaps replication is not actually out of sync, and the delay is just relatively large. Is this normal? The following is the storage monitoring of the two clusters at the same point in time.
@HGHNice Have you checked which subscription has the backlog that retains messages? Taking a look at that could help find out the reason.
This is the subscription status of the master cluster.
This is the status of the slave node.
They are obviously different.
Slave node
Master cluster
Can you guide me on how to check?
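One possible way to check, as a rough sketch rather than an official procedure: dump the topic stats on each cluster with `pulsar-admin topics stats <topic>` and rank the subscriptions and replicators by backlog. The helper below is hypothetical. It assumes the stats JSON exposes `subscriptions.<name>.msgBacklog` and `replication.<cluster>.replicationBacklog`; please verify those field names against the output of your Pulsar version.

```python
import json


def backlog_report(stats_json):
    """Rank subscriptions and replicators of one topic by backlog.

    stats_json: JSON text, assumed to be the output of
    `pulsar-admin topics stats <topic>` (field names are assumptions).
    Returns a list of (kind, name, backlog) tuples, largest backlog first.
    """
    stats = json.loads(stats_json)
    rows = []
    # Backlog held by each subscription on this cluster.
    for name, sub in stats.get("subscriptions", {}).items():
        rows.append(("subscription", name, sub.get("msgBacklog", 0)))
    # Messages not yet replicated to each remote cluster.
    for cluster, rep in stats.get("replication", {}).items():
        rows.append(("replicator", cluster, rep.get("replicationBacklog", 0)))
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Running this against the stats saved from both clusters and comparing the top entries should show whether the storage gap comes from a subscription that is not being acknowledged or from a replicator that is lagging.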