go-ds-crdt

Introduce Snapshotting to Optimize CRDT DAG Compaction and Traversal

Open mgazza opened this issue 1 year ago • 1 comments

Hi, I've been playing with this library and it's great. However, I'm not happy about the DAG's continuous growth, and I'd like to help.

The current implementation of go-ds-crdt maintains a DAG structure to represent state changes in a datastore. While effective, this approach has limitations:

  • Performance Overhead: Traversing and replaying large DAGs to compute the current state becomes costly as the DAG grows.
  • Garbage Collection: Stale or redundant nodes in the DAG are retained unnecessarily, increasing storage requirements.
  • Recovery Times: Rebuilding state after a crash or restart is slow, as it requires replaying the full DAG history.

Snapshotting introduces a mechanism to address these issues. By creating a periodic snapshot of the DAG, we can:

  • Compact the state into a "base snapshot."
  • Limit traversal to only the most recent nodes.
  • Facilitate faster recovery by replaying transactions only since the last snapshot.
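To make the proposal concrete, here is a minimal sketch of the snapshot-then-replay idea. It assumes a simplified, linear log of deltas ordered by height rather than go-ds-crdt's actual Merkle DAG with concurrent branches, and it uses plain overwrite/delete semantics instead of real CRDT merge rules. `Delta`, `Snapshot`, `Compact`, and `Recover` are hypothetical names for illustration only, not part of the library's API.

```go
package main

import "fmt"

// Delta is a hypothetical stand-in for the payload of one DAG node.
// Height is its position in a simplified, linear history.
type Delta struct {
	Height  int
	Puts    map[string]string
	Deletes []string
}

// Snapshot is the compacted state up to and including Height.
type Snapshot struct {
	Height int
	State  map[string]string
}

// apply mutates state with a single delta (puts first, then deletes).
func apply(state map[string]string, d Delta) {
	for k, v := range d.Puts {
		state[k] = v
	}
	for _, k := range d.Deletes {
		delete(state, k)
	}
}

// cloneState copies a state map so snapshots are never mutated in place.
func cloneState(src map[string]string) map[string]string {
	dst := make(map[string]string, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

// Compact folds every delta with Height <= upTo into a new base snapshot
// and returns the deltas that still have to be kept and traversed.
func Compact(base Snapshot, history []Delta, upTo int) (Snapshot, []Delta) {
	state := cloneState(base.State)
	height := base.Height
	var rest []Delta
	for _, d := range history {
		if d.Height <= upTo {
			apply(state, d)
			if d.Height > height {
				height = d.Height
			}
		} else {
			rest = append(rest, d)
		}
	}
	return Snapshot{Height: height, State: state}, rest
}

// Recover rebuilds the current state from a snapshot plus only the
// deltas recorded after it, instead of replaying the full history.
func Recover(snap Snapshot, history []Delta) map[string]string {
	state := cloneState(snap.State)
	for _, d := range history {
		if d.Height > snap.Height {
			apply(state, d)
		}
	}
	return state
}

func main() {
	history := []Delta{
		{Height: 1, Puts: map[string]string{"a": "1"}},
		{Height: 2, Puts: map[string]string{"b": "2"}},
		{Height: 3, Deletes: []string{"b"}},
		{Height: 4, Puts: map[string]string{"c": "3"}},
	}
	// Compact the first three deltas into a base snapshot.
	snap, rest := Compact(Snapshot{}, history, 3)
	fmt.Println(snap.Height, len(snap.State), len(rest))
	// Recovery only needs the snapshot and the one remaining delta.
	fmt.Println(Recover(snap, rest))
}
```

In this toy version the deltas folded into the snapshot can be discarded, which is where the storage and recovery savings would come from. A real implementation would also have to agree on the snapshot point across replicas and handle concurrent branches, which this sketch does not attempt.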

mgazza avatar Nov 21 '24 14:11 mgazza

Hi, I see that this was already implemented in #288 but closed shortly after opening. @mgazza, is it still in the works?

cryi avatar Sep 14 '25 08:09 cryi

Hi @cryi, yes, we implemented this, but it really just pushed the problem down a level: you end up needing a Merkle DAG of snapshots, and that perpetuates the same problem. We're currently experimenting with a slightly different mechanism that does away with the need for snapshots: the CL-SET. The repo is private right now, but we will open-source it as soon as we can. We plan to make the interface compatible with go-ds-crdt.

mgazza avatar Sep 15 '25 09:09 mgazza