Make Go client tolerant to replication lag
Related: https://github.com/line/centraldogma/issues/435
A client sometimes gets different entries from different replicas in the same Central Dogma cluster, mainly due to replication lag. For example, even if you learned from replica A that the latest revision of a repository is 42, you may still fail to get a file at revision 42 from replica B.
Currently, in such a case, our Go client simply fails with 404, which is confusing to our users.
We could instead keep track of the latest known revision number of each repository and retry a few more times when it is certain that the 404 error was due to replication lag.
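The idea above could be sketched roughly as follows. This is only an illustration, not the actual client code: `revisionTracker`, `fetchFunc`, `fetchWithRetry`, and `ErrNotFound` are hypothetical names, and the sketch assumes the 404 means "revision not found" rather than "file not found at an existing revision", which a real implementation would need to distinguish.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrNotFound is a hypothetical stand-in for a 404 response from a replica.
var ErrNotFound = errors.New("404 not found")

// revisionTracker remembers the highest revision seen for each repository.
type revisionTracker struct {
	mu     sync.Mutex
	latest map[string]int
}

func newRevisionTracker() *revisionTracker {
	return &revisionTracker{latest: make(map[string]int)}
}

// update records rev if it is newer than the revision already known.
func (t *revisionTracker) update(repo string, rev int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if rev > t.latest[repo] {
		t.latest[repo] = rev
	}
}

// known reports whether rev has already been observed for repo,
// which means some replica in the cluster must have it.
func (t *revisionTracker) known(repo string, rev int) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return rev > 0 && rev <= t.latest[repo]
}

// fetchFunc is a hypothetical stand-in for a single request to a replica.
type fetchFunc func(repo string, rev int) (string, error)

// fetchWithRetry retries a 404 only when the requested revision is already
// known to exist, so the failure can be attributed to replication lag.
func fetchWithRetry(t *revisionTracker, fetch fetchFunc, repo string,
	rev, maxRetries int, delay time.Duration) (string, error) {
	for attempt := 0; ; attempt++ {
		content, err := fetch(repo, rev)
		if err == nil {
			t.update(repo, rev)
			return content, nil
		}
		if !errors.Is(err, ErrNotFound) || !t.known(repo, rev) || attempt >= maxRetries {
			return "", err
		}
		time.Sleep(delay)
	}
}

func main() {
	tracker := newRevisionTracker()
	tracker.update("myRepo", 42) // e.g. learned from replica A

	// Simulates replica B, which catches up on the third request.
	calls := 0
	laggingReplica := func(repo string, rev int) (string, error) {
		calls++
		if calls < 3 {
			return "", ErrNotFound
		}
		return "file content", nil
	}

	content, err := fetchWithRetry(tracker, laggingReplica, "myRepo", 42, 3, 10*time.Millisecond)
	fmt.Println(content, err, calls)
}
```

A 404 for a revision the client has never seen is returned immediately, so requests for revisions that do not exist still fail fast.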
Any update?
We're trying to list the files immediately after receiving an event from the watch repository API. When we update some files in the repository, some pods get the latest files successfully, but the rest fail to get the most recently updated files.
It seems to be a timing issue caused by replication lag, so we may need the replication-lag-tolerant client you described.
Hi @mingrammer! Sorry for the trouble this has caused you. 😅 Yeah, we definitely need that. I think we could probably work on implementing this feature next month. Or, are you interested in sending a PR that adds the feature? We have the Java version of the client, so it wouldn't be that hard. 😆
@minwoox Thank you for the reply. Yep, I'm willing to contribute to this.
Thanks for volunteering for this issue. Please let us know if you need anything. 😆