ssdb with replication type=mirror systematically goes OUT_OF_SYNC after syncing the same amount of data

Open saveriocastellano opened this issue 4 years ago • 4 comments

hello,

We are using two ssdb instances (running on different machines) that sync with type=mirror, masterA - masterB. We actively use only one of them, masterA, which currently contains approx. 96 GB of data. After shutting down masterB and deleting all its data, we restarted ssdb and let it sync from scratch again. After masterB reached 56.7 GB it went OUT_OF_SYNC. What is strange is that if we repeat the experiment and let it sync from scratch again, it always goes OUT_OF_SYNC after exactly 56.7 GB of data has been synced. So the problem seems to be caused by the data rather than by heavy load or a network problem. We have already set a high value for the binlog capacity and run network tests to make sure there are no network-related bottlenecks. Until a few weeks ago the two nodes were syncing fine, and nothing has changed since then apart from adding less than 10% new data to masterA.

Can you please help us and suggest what the problem could be?

saveriocastellano, Apr 22 '20 11:04

When a new node (masterB) is added, masterA iterates over the whole leveldb database to make a full dump (snapshot). leveldb may run a compaction during this time. I have seen some cases where the compaction blocks the iteration, but I am not sure whether that is a bug in leveldb.

In any case, forcing leveldb to compact (the compact command in ssdb-cli) before adding the new node would be a good idea.
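
For anyone who would rather script the compaction step than type it into ssdb-cli, here is a minimal Python sketch. It assumes the server accepts `compact` as an ordinary command over SSDB's block-based wire protocol (each block is a size line followed by the data, and an empty line ends the packet). The helper names `encode_request`, `decode_response` and `send_command` are made up for illustration, and the host and port in the usage comment are placeholders. It has not been tested against a database of the size discussed here.

```python
import socket


def encode_request(*args):
    """Encode a command as SSDB blocks: '<size>\\n<data>\\n' per argument,
    followed by an empty line that terminates the packet."""
    out = b""
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        out += str(len(data)).encode("ascii") + b"\n" + data + b"\n"
    return out + b"\n"


def decode_response(raw):
    """Decode one complete response packet into a list of byte strings.
    Raises ValueError if the packet is incomplete."""
    blocks = []
    pos = 0
    while True:
        nl = raw.index(b"\n", pos)   # ValueError if no newline yet
        header = raw[pos:nl]
        if header == b"":            # empty line: end of packet
            return blocks
        size = int(header)
        start = nl + 1
        blocks.append(raw[start:start + size])
        pos = start + size + 1       # skip the '\n' that follows the data


def send_command(host, port, *args, timeout=None):
    """Send one command and return the decoded response blocks.
    The first block is the status, e.g. b'ok'."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_request(*args))
        buf = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("connection closed before a full response arrived")
            buf += chunk
            try:
                return decode_response(buf)
            except ValueError:
                continue             # packet incomplete, keep reading


# Hypothetical usage (address and port are placeholders):
# status = send_command("127.0.0.1", 8888, "compact")
```

No timeout is set by default, because a compaction on a large database can take a long time to return.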

ideawu, Apr 22 '20 11:04

Thanks for the reply.

There is just one problem: because masterB is not synced right now, all the data exists only on masterA, and that is the master we are actively using to serve players in the live system. I'm afraid that if I run compaction on masterA it will make the live system crawl for 2-3 hours. What do you think?

saveriocastellano, Apr 22 '20 11:04

Run compact when the server is not busy.

ideawu, Apr 22 '20 12:04

??? What kind of reply is that??? My server is used 24 hours a day, it's a service, so it is always busy.

saveriocastellano, Apr 22 '20 12:04