yorkie
Add `--backend-snapshot-with-purging-changes` flag
What happened: Hey there! Yorkie seems really really cool - exactly what I've been looking for for a niche project of mine. Very cool to finally find a project that has support for realtime collaborative editing but isn't just the client side with everything else left to create from scratch.
I've got it running remotely behind envoy and am able to sync docs etc. Trying now to look at some performance and such for the database (using mongodb) and noticed that it doesn't look like garbage collection is happening after a snapshot is taken. Or it's possible I don't fully understand the GC behavior properly 😓 😄
What you expected to happen:
I've been using the Text CRDT type to do collaborative text editing, like so:
    doc.update((root) => {
      if (content) {
        if (!root.content) {
          const text = root.createText("content");
          text.edit(content.start, content.end, content.text);
        } else {
          root.content.edit(content.start, content.end, content.text);
        }
      }
    });
After 1000 edits, there are 1000 docs in the `changes` collection. I can see a snapshot being taken like so:
INFO SNAP: '61d5e45922aadcfafba4606f$61d94aa4a3ff7fe9024d9cc9', serverSeq: 1010
INFO RPC : "/api.Yorkie/PushPull" 653.961929ms
INFO PUSH: '61d94aab0035805841153d7a' pushes 12 changes into '61d5e45922aadcfafba4606f$61d94aa4a3ff7fe9024d9cc9', rejected 0 changes, serverSeq: 1028 -> 1040, cp: serverSeq=1040, clientSeq=1040
However, it still seems like there are the same number of documents in the collection, which is surprising to me. I would've thought that the snapshot would remove the need for the entire history of changes made so far (maybe I'm off on that) or at least that some changes would've been removed as unnecessary. I noticed that there was a recent release that specifically enabled this to happen on snapshots, so wanted to file this issue in case either I'm missing something or it's a real bug. Thank you!
How to reproduce it (as minimally and precisely as possible):
see above
Anything else we need to know?:
Environment:
- Operating system: linux (amd64 docker image)
- Browser and version: n/a
- Yorkie version (use `yorkie version`): 0.2.1
- Yorkie JS SDK version: 0.2.0
Thank you for your interest in the Yorkie project. I'm glad to hear that you agree with our project's goal, "Just out of box".
The currently implemented GC focuses on removing tombstones in the CRDT. So the GC removes tombstones from the snapshot; it does not delete the changes.
- Briefly about tombstone: the following website
- Snapshot size after GC: https://github.com/yorkie-team/yorkie/pull/287
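To illustrate the distinction, here is a toy sketch of what a tombstone is and what GC does to it. This is not Yorkie's actual implementation: `ToyText`, `removedAt`, and `minSyncedTime` are hypothetical names, and the model ignores concurrency entirely. It only shows that removed characters stay in the structure as tombstones until GC drops them, and that GC shrinks the document state without touching any stored changes.

```javascript
// Toy model only: NOT Yorkie's real data structures or API.
class ToyText {
  constructor() {
    // Each node keeps its value and, once removed, the logical time of removal.
    this.nodes = [];
  }

  append(value) {
    this.nodes.push({ value, removedAt: null });
  }

  // Mark the i-th visible character as removed at logical time `time`.
  // The node is kept as a tombstone rather than deleted.
  remove(visibleIndex, time) {
    const live = this.nodes.filter((n) => n.removedAt === null);
    if (visibleIndex < 0 || visibleIndex >= live.length) {
      throw new RangeError(`no visible character at index ${visibleIndex}`);
    }
    live[visibleIndex].removedAt = time;
  }

  toString() {
    return this.nodes
      .filter((n) => n.removedAt === null)
      .map((n) => n.value)
      .join("");
  }

  // Drop tombstones whose removal every client is assumed to have seen
  // (removal time <= minSyncedTime). Returns the number of nodes purged.
  gc(minSyncedTime) {
    const before = this.nodes.length;
    this.nodes = this.nodes.filter(
      (n) => n.removedAt === null || n.removedAt > minSyncedTime
    );
    return before - this.nodes.length;
  }
}

const text = new ToyText();
for (const ch of "hello") text.append(ch);
text.remove(0, 1); // removes "h" at time 1
text.remove(0, 2); // removes "e" at time 2
console.log(text.toString(), text.nodes.length);
```

In this sketch the visible text is unchanged by `gc`; only the internal node count goes down, which corresponds to the smaller snapshot size, not to fewer entries in the `changes` collection.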
However, as you mentioned, we can also delete changes before the snapshot creation time, because after creating a snapshot, we can use the snapshot to rebuild a specific state of the document. This can prevent the changes collection from continuing to grow.
I think it would be good to add a flag to delete previous changes after snapshot creation: `--backend-snapshot-with-purging-changes`.
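A rough sketch of the idea, assuming an in-memory list of changes ordered by `serverSeq` and a document that is just a string built by appending. The functions `takeSnapshot`, `purgeChanges`, and `rebuild` are hypothetical and do not reflect Yorkie's backend code or MongoDB schema. The point is only that once a snapshot exists at some `serverSeq`, the changes at or below that sequence are no longer needed to rebuild later states.

```javascript
// Hypothetical model: a change appends `text` and carries a serverSeq.
// Changes are assumed to be sorted by serverSeq in ascending order.

// Fold all given changes into a snapshot of the document state.
function takeSnapshot(changes) {
  let content = "";
  let serverSeq = 0;
  for (const change of changes) {
    content += change.text;
    serverSeq = change.serverSeq;
  }
  return { serverSeq, content };
}

// Keep only the changes made after the snapshot was taken.
function purgeChanges(changes, snapshot) {
  return changes.filter((change) => change.serverSeq > snapshot.serverSeq);
}

// Rebuild the current document from the snapshot plus the later changes.
function rebuild(snapshot, changes) {
  let content = snapshot.content;
  for (const change of changes) {
    if (change.serverSeq > snapshot.serverSeq) {
      content += change.text;
    }
  }
  return content;
}

const changes = [
  { serverSeq: 1, text: "a" },
  { serverSeq: 2, text: "b" },
  { serverSeq: 3, text: "c" },
  { serverSeq: 4, text: "d" },
  { serverSeq: 5, text: "e" },
];

// Snapshot taken after the third change; two more changes arrive later.
const snapshot = takeSnapshot(changes.slice(0, 3));
const remaining = purgeChanges(changes, snapshot);
console.log(remaining.length, rebuild(snapshot, remaining));
```

One caveat with this approach: after purging, a client whose checkpoint is older than the snapshot can no longer be caught up change by change and would have to receive the snapshot instead, which is one reason to keep the behavior behind a flag.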
ahhh I see - that makes sense. so the document itself is compressed versus the history of the document. Makes total sense!
I would tend to agree re: the purge changes flag. It seems like you would run into extremely high document counts with any kind of significant usage around text editing especially. As long as the option is tunable, it seems like it would be a great idea to have 👍
Could I try this issue?
@chromato99 Sure. If you have any questions please let me know.