dwn-sdk-js
Add a '$singleton' option for graph points to keep only the latest single record
Often a graph point is intended to hold only one record, never multiple. A '$singleton' boolean option would let the dev specify that an object in the graph keeps only one record, the latest one, with all others discarded. Think of a blog protocol with an index html record: you only ever want one, don't really care about the recordId, and just want the latest kept. The $singleton option would make that possible.
- Can `$singleton` apply to records other than the root record?
  - If yes, how does that work? Say `foo/bar` is a `$singleton`. Can there only ever be one record at path `foo/bar` across contexts? If I try to write a new `foo/bar`, does that get rejected or delete the existing `foo/bar`, which may be in a different context?
  - If no, we need to validate upon protocol ingestion. It seems like the point of `$singleton` is to restrict a protocol to only one context.
- I want to unpack this phrase: "just want the latest kept". If I have `$singleton` for protocol path `foo`, and a tree of descendant records below it, will writing a new record to path `foo` delete the entire existing tree? That's pretty drastic and dangerous behavior.
I've been working on how this would work and have some WIP PRs around it.
As discussed with @csuwildcat, and I believe suggested by @diehuxx, a "$keep" property that takes a positive integer, rather than a "$singleton" property that can only denote 1, would be more flexible.
Some thoughts on @diehuxx's questions above:
Yes, I think "$keep" records should be able to be nested. Currently, when a new record is written the older ones are purged.
However the reject path is also interesting, and I could see how that might be useful. Maybe behind a protocol definition property that denotes the behavior of $keep, with default being purge. Maybe something like this:
"$keep" : {
"limit" : 1,
"strategy": "purge" | "reject"
}
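To make the proposed shape concrete, here is a minimal TypeScript sketch of how such a rule might be typed and validated when a protocol definition is ingested. The names `KeepRule`, `KeepStrategy`, and `normalizeKeepRule` are hypothetical and are not part of dwn-sdk-js; the only things taken from the proposal are the `limit` and `strategy` fields, the requirement that the limit be a positive integer, and purge as the default.

```typescript
// Hypothetical types for the proposed "$keep" rule; not an existing dwn-sdk-js API.
type KeepStrategy = "purge" | "reject";

interface KeepRule {
  limit: number;
  strategy?: KeepStrategy; // the proposal suggests "purge" as the default
}

// Validates a raw "$keep" value and fills in the default strategy.
function normalizeKeepRule(input: unknown): Required<KeepRule> {
  if (typeof input !== "object" || input === null) {
    throw new Error("$keep must be an object");
  }
  const { limit, strategy = "purge" } = input as Partial<KeepRule>;
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1) {
    throw new Error("$keep.limit must be a positive integer");
  }
  if (strategy !== "purge" && strategy !== "reject") {
    throw new Error('$keep.strategy must be "purge" or "reject"');
  }
  return { limit, strategy };
}
```

With this sketch, `normalizeKeepRule({ limit: 1 })` yields a rule with the `purge` strategy, which corresponds to the original `$singleton` behavior.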
For anything that is a child of a protocol context, the parentId and contextId are required upon creation of that record. Currently I "$keep" the limit number of records within that context.
So you could have "foo/bar" with a "$keep" limit of 5; you would then keep 5 "foo/bar"s for each parent instance of "foo".
If you had 5 "foo" records, you would then have a total of 25 "foo/bar" records. If you query only on protocolPath 'foo/bar' without a contextId you will get all 25 records.
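The per-parent retention described above can be sketched roughly as follows. This is an in-memory illustration only, not the WIP implementation: `StoredRecord` and `writeWithKeep` are hypothetical names, and the sketch assumes the retention scope is the pair of protocolPath and parentId, with the newest records (by creation time) being the ones kept.

```typescript
// Hypothetical, simplified record shape for illustration.
interface StoredRecord {
  recordId: string;
  protocolPath: string;
  parentId?: string;
  dateCreated: number;
}

// Writes `incoming` and purges the oldest records that share its
// protocolPath and parentId once more than `limit` of them exist.
function writeWithKeep(
  store: StoredRecord[],
  incoming: StoredRecord,
  limit: number
): { kept: StoredRecord[]; purged: StoredRecord[] } {
  const sameScope = (r: StoredRecord): boolean =>
    r.protocolPath === incoming.protocolPath && r.parentId === incoming.parentId;

  // Newest first, so everything past `limit` is the overflow to purge.
  const scoped = [...store.filter(sameScope), incoming].sort(
    (a, b) => b.dateCreated - a.dateCreated
  );
  const purged = scoped.slice(limit);
  const purgedIds = new Set(purged.map((r) => r.recordId));
  const kept = [...store, incoming].filter((r) => !purgedIds.has(r.recordId));
  return { kept, purged };
}

// Usage: 5 "foo" parents, 7 "foo/bar" writes under each, limit of 5.
let store: StoredRecord[] = [];
for (let p = 0; p < 5; p++) {
  for (let i = 0; i < 7; i++) {
    const incoming: StoredRecord = {
      recordId: `bar-${p}-${i}`,
      protocolPath: "foo/bar",
      parentId: `foo-${p}`,
      dateCreated: i,
    };
    store = writeWithKeep(store, incoming, 5).kept;
  }
}
console.log(store.length); // 25
```

Because the scope includes the parent, a query on protocolPath 'foo/bar' alone still sees all 25 records, while each individual "foo" only ever holds 5.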
Would like to get some input on this.
@LiranCohen Looks good!
We discussed at office hours. I'll summarize:
- I like the name `$keep` and the structure you proposed.
- We should leave out `strategy` for now. `reject` isn't worth implementing now (ever?). The inevitable DevEx for `reject` is bad because.
- My remaining concern is about how to implement this in a way that accommodates sync, in particular how to make purging a record tree performant. When a record is purged, all of its descendants in the protocol are also purged. When purged, the record and its descendants must be deleted from the event log. How do we efficiently get the message CIDs of all descendants?
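One possible way to think about the descendant question is a breadth-first walk over a parentId index. The sketch below is an in-memory illustration under stated assumptions, not a proposal for the actual store: `IndexedMessage` and `collectSubtreeCids` are hypothetical names, and it assumes every message exposes its recordId, optional parentId, and messageCid. In a real store the two maps would correspond to indexes, so that each level of the tree costs one lookup per record rather than a full scan.

```typescript
// Hypothetical, simplified view of an indexed message.
interface IndexedMessage {
  recordId: string;
  parentId?: string;
  messageCid: string;
}

// Returns the message CIDs of `rootRecordId` and all of its descendants.
function collectSubtreeCids(
  messages: IndexedMessage[],
  rootRecordId: string
): string[] {
  const cidsByRecord = new Map<string, string[]>();
  const childrenByParent = new Map<string, string[]>();

  // Build both lookups in a single pass over the messages.
  for (const m of messages) {
    const cids = cidsByRecord.get(m.recordId) ?? [];
    cids.push(m.messageCid);
    cidsByRecord.set(m.recordId, cids);
    if (m.parentId !== undefined) {
      const kids = childrenByParent.get(m.parentId) ?? [];
      kids.push(m.recordId);
      childrenByParent.set(m.parentId, kids);
    }
  }

  // Breadth-first traversal from the purged record down through its tree.
  const result: string[] = [];
  const queue: string[] = [rootRecordId];
  const seen = new Set<string>(queue);
  while (queue.length > 0) {
    const id = queue.shift() as string;
    result.push(...(cidsByRecord.get(id) ?? []));
    for (const child of childrenByParent.get(id) ?? []) {
      if (!seen.has(child)) {
        seen.add(child);
        queue.push(child);
      }
    }
  }
  return result;
}
```

The returned CIDs are what would then need to be removed from the event log; whether a traversal like this is fast enough at scale is exactly the open question.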
Or maybe just `$limit`, as it was originally brought up?
@thehenrytsai I think they like `keep` because it implies purging/retention.
@csuwildcat, I see, fair, so: `$keep: 1` seems reasonable.