horaedb
Multiple writers for the same sst caused by `close shard`
Describe this problem
A shard may be moved off a node when the process panics or for any other reason. All operations related to such a shard, especially write operations, should be stopped before it is moved.
However, background work (flush and compaction, both of which are writes) is currently not stopped properly. This causes a serious bug: multiple writers for one sst.
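The requirement above (no background write may outlive the shard close) can be sketched as a small gate that rejects new background jobs once closing starts and blocks until in-flight ones finish. This is only an illustration under stated assumptions: `BackgroundGate`, `JobGuard`, `try_begin`, and `close` are hypothetical names and are not taken from the CeresDB code base or from any of the PRs mentioned in this issue.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical bookkeeping for the background jobs of one shard.
#[derive(Default)]
struct State {
    closed: bool,
    in_flight: usize,
}

/// Hypothetical gate shared by the shard and its flush/compaction jobs.
#[derive(Clone, Default)]
pub struct BackgroundGate {
    inner: Arc<(Mutex<State>, Condvar)>,
}

/// Held by a running background job; dropping it marks the job as finished.
pub struct JobGuard {
    inner: Arc<(Mutex<State>, Condvar)>,
}

impl BackgroundGate {
    pub fn new() -> Self {
        Self::default()
    }

    /// Called before a flush/compaction starts writing.
    /// Returns `None` once the shard is closing, so no new writer can start.
    pub fn try_begin(&self) -> Option<JobGuard> {
        let (lock, _) = &*self.inner;
        let mut state = lock.lock().unwrap();
        if state.closed {
            return None;
        }
        state.in_flight += 1;
        Some(JobGuard {
            inner: Arc::clone(&self.inner),
        })
    }

    /// Called when the shard is closed: refuse new jobs, then wait until
    /// every in-flight job has dropped its guard.
    pub fn close(&self) {
        let (lock, cvar) = &*self.inner;
        let mut state = lock.lock().unwrap();
        state.closed = true;
        while state.in_flight > 0 {
            state = cvar.wait(state).unwrap();
        }
    }

    /// Number of background jobs currently holding a guard.
    pub fn in_flight(&self) -> usize {
        self.inner.0.lock().unwrap().in_flight
    }
}

impl Drop for JobGuard {
    fn drop(&mut self) {
        let (lock, cvar) = &*self.inner;
        let mut state = lock.lock().unwrap();
        state.in_flight -= 1;
        // Wake up a pending `close()` so it can re-check the counter.
        cvar.notify_all();
    }
}
```

With such a gate, the shard would only be reported as closed (and thus safe to open on another node) after `close()` returns, at which point no flush or compaction from the old node can still be writing.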
Server version
CeresDB Server Version: 1.2.2 Git commit: 2e206650 Git branch: main Opt level: 3 Rustc version: 1.69.0-nightly Target: aarch64-apple-darwin Build date: 2023-06-12T13:01:03.592984000Z
Steps to reproduce
Hard to reproduce. If it must be reproduced, the following steps may work:
- Set up a ceresdb cluster with ceresmeta.
- Manually trigger compaction/flush work for a specific table of a shard on one node.
- Manually move the shard to another node via ceresmeta before the compaction/flush work finishes.
- Manually trigger compaction/flush work for the same table of the shard on the new node.
Expected behavior
No response
Additional Information
No response
After #998, updates that follow the closing of a shard will be forbidden. However, some ssts may still be in the middle of being written when the shard is closed, and these ssts may share the same ids as new ssts created by the new node, leading to multiple writers on the same sst.
Let's fix this problem in another PR. @baojinri
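The id collision described in this comment can be illustrated with a toy model. It assumes, purely for illustration, that sst ids come from a counter recovered from persisted metadata when a shard is opened, and that an id allocated for an in-flight sst is not yet visible in that metadata. `Manifest`, `SstIdAllocator`, `ObjectStore`, and `simulate_collision` are all hypothetical and do not describe how CeresDB actually allocates ids.

```rust
use std::collections::HashMap;

/// Hypothetical persisted metadata visible to every node.
struct Manifest {
    last_sst_id: u64,
}

/// Hypothetical per-node allocator, rebuilt whenever a node opens the shard.
struct SstIdAllocator {
    next: u64,
}

impl SstIdAllocator {
    fn recover(manifest: &Manifest) -> Self {
        Self {
            next: manifest.last_sst_id + 1,
        }
    }

    fn alloc(&mut self) -> u64 {
        let id = self.next;
        self.next += 1;
        id
    }
}

/// Hypothetical object store that records which nodes wrote each sst id.
#[derive(Default)]
struct ObjectStore {
    writers: HashMap<u64, Vec<&'static str>>,
}

impl ObjectStore {
    fn write(&mut self, id: u64, node: &'static str) {
        self.writers.entry(id).or_default().push(node);
    }
}

/// Returns the contested sst id and the nodes that wrote to it.
pub fn simulate_collision() -> (u64, Vec<&'static str>) {
    let manifest = Manifest { last_sst_id: 41 };

    // The old node starts a flush; the new id is not persisted yet.
    let mut old_node = SstIdAllocator::recover(&manifest);
    let in_flight_id = old_node.alloc();

    // The shard is closed and opened elsewhere while the flush is running,
    // so the new node recovers from the same persisted state.
    let mut new_node = SstIdAllocator::recover(&manifest);
    let new_id = new_node.alloc();

    let mut store = ObjectStore::default();
    store.write(new_id, "new-node");
    store.write(in_flight_id, "old-node"); // the late write from the old flush

    let writers = store.writers.remove(&new_id).unwrap_or_default();
    (new_id, writers)
}
```

In this model both nodes end up writing the sst with the same id, which is the multiple-writer situation this issue reports. Forbidding updates after close does not help here, because the old node's write was already in flight.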
#1009 has fixed the problem. However, #998 did not actually achieve its goal of preventing updates after a table is closed, and it has been reverted. I guess I'll submit another change set to make everything work.