[Bug] Synchronization using geo's namespace-level replicas Message lost

Open HGHNice opened this issue 7 months ago • 5 comments

Search before reporting

  • [x] I searched in the issues and found nothing similar.

Read release policy

  • [x] I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

User environment

Pulsar version 3.0.11

Issue Description

Two clusters are set up with namespace-level geo-replication, but on one of the clusters the traffic and storage of the replicated messages differ greatly from those of the main cluster.

Error messages

No error message was seen

Reproducing the issue

Image

Image

Image

Image

Additional information

null

Are you willing to submit a PR?

  • [x] I'm willing to submit a PR!

HGHNice avatar May 30 '25 08:05 HGHNice

@HGHNice What do you mean by "the messages synchronized from the cluster are far different from the main cluster traffic and storage"? Please provide concrete examples to make this easier to understand.

lhotari avatar May 30 '25 16:05 lhotari

I have set up bidirectional geo-replication between two clusters, following the official website. When I produce messages to the main cluster, I expect the storage of both clusters to end up the same. In practice, there are large storage differences between the two clusters, which raises the doubts shown in the figure below. What other figures should I provide for analysis?

Image

HGHNice avatar Jun 03 '25 03:06 HGHNice

Or perhaps the clusters are not out of sync, and the replication delay is just relatively large. Is this normal? Below is the storage monitoring of the two clusters at the same time.

Image

Image

HGHNice avatar Jun 03 '25 12:06 HGHNice

@HGHNice Have you checked which subscription has the backlog that retains messages? Taking a look at that could help find out the reason.

lhotari avatar Jun 03 '25 19:06 lhotari

This is the subscription status on the master cluster: Image

This is the status on the slave node: Image

They are obviously different.

Slave node: Image

Master cluster: Image

Can you guide me on how to check?

HGHNice avatar Jun 04 '25 01:06 HGHNice
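
One possible way to check which subscription is holding a backlog, offered as a minimal sketch and not as a maintainer-confirmed procedure: dump the topic stats as JSON on each cluster (for example with `pulsar-admin topics stats <topic>`, or `pulsar-admin topics partitioned-stats <topic>` for a partitioned topic) and compare the per-subscription `msgBacklog` and the per-cluster `replicationBacklog`. The helper name `summarize_backlog` and the sample subscription and cluster names below are hypothetical; the field names are assumed to match the topic stats JSON of your Pulsar version.

```python
import json


def summarize_backlog(stats: dict) -> dict:
    """Summarize storage and backlog figures from a topic stats JSON document.

    Assumes the layout printed by `pulsar-admin topics stats`:
    top-level `storageSize` and `backlogSize`, a `subscriptions` map whose
    entries carry `msgBacklog`, and a `replication` map whose entries carry
    `replicationBacklog` and `connected`.
    """
    subscriptions = [
        (name, sub.get("msgBacklog", 0))
        for name, sub in stats.get("subscriptions", {}).items()
    ]
    # Largest backlog first, so the subscription retaining data is on top.
    subscriptions.sort(key=lambda item: item[1], reverse=True)

    replication = [
        (cluster, repl.get("replicationBacklog", 0), repl.get("connected", False))
        for cluster, repl in stats.get("replication", {}).items()
    ]
    replication.sort(key=lambda item: item[1], reverse=True)

    return {
        "storageSize": stats.get("storageSize", 0),
        "backlogSize": stats.get("backlogSize", 0),
        "subscriptions": subscriptions,
        "replication": replication,
    }


# Hypothetical sample data, shaped like the stats output; not taken from this issue.
SAMPLE_STATS = {
    "storageSize": 1048576,
    "backlogSize": 524288,
    "subscriptions": {
        "sub-a": {"msgBacklog": 0},
        "sub-b": {"msgBacklog": 1200},
    },
    "replication": {
        "cluster-b": {"replicationBacklog": 300, "connected": True},
    },
}

if __name__ == "__main__":
    print(json.dumps(summarize_backlog(SAMPLE_STATS), indent=2))
```

Running the same summary against the stats of both clusters puts the figures side by side. A subscription with a large `msgBacklog` on only one cluster, or a replicator with a growing `replicationBacklog`, would be the first place to look when storage differs between the clusters.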