
Batch Commits

Open theduke opened this issue 3 years ago • 6 comments

There should be a way to change multiple subjects in one commit to guarantee data integrity.

There could either be a new BatchCommit class or a new property on Commit that contains multiple actions, with each action being a nested resource of subject + action (destroy, remove, set, push).

I'd be in favor of adapting Commit to not introduce multiple concepts.

I'd also suggest deprecating a plain subject + action directly in the commit, and having all commits use actions instead.

theduke avatar Jul 27 '22 12:07 theduke

Example:

{
  "@id": "https://atomicdata.dev/commits/SOME-ID",
  "https://atomicdata.dev/properties/createdAt": 0,
  "https://atomicdata.dev/properties/isA": [
    "https://atomicdata.dev/classes/Commit"
  ],
  "https://atomicdata.dev/properties/commitActions": [
    {
      "https://atomicdata.dev/properties/subject": "some-subject",
      "https://atomicdata.dev/properties/set": {
        "https://atomicdata.dev/properties/shortname": "1611489928"
      }
    }
  ],
  "https://atomicdata.dev/properties/signature": "3n+U/3OvymF86Ha6S9MQZtRVIQAAL0rv9ZQpjViht4emjnqKxj4wByiO9RhfL+qwoxTg0FMwKQsNg6d0QU7pAw==",
  "https://atomicdata.dev/properties/signer": "https://surfy.ddns.net/agents/9YCs7htDdF4yBAiA4HuHgjsafg+xZIrtZNELz4msCmc=",
  "https://atomicdata.dev/properties/previousCommit": "https://surfy.ddns.net/commits/9YCs7htDdF4yBAiA4HuHgjsafg+xZIrtZNELz4msCmc="
}

theduke avatar Jul 27 '22 12:07 theduke

Thinking about it, this would also enable a marginally cleaner schema by requiring a class for each action.

E.g. CommitActionDestroy, CommitActionModify:

This also allows for additional action types in the future.

"https://atomicdata.dev/properties/commitActions": [
  {
    "https://atomicdata.dev/properties/isA": ["https://atomicdata.dev/classes/CommitActionDestroy"],
    "https://atomicdata.dev/properties/subject": "some-subject1"
  },
  {
    "https://atomicdata.dev/properties/isA": ["https://atomicdata.dev/classes/CommitActionModify"],
    "https://atomicdata.dev/properties/subject": "some-subject1",
    "https://atomicdata.dev/properties/set": {
      "https://atomicdata.dev/properties/shortname": "1611489928"
    }
  }
],
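
To illustrate how a class per action could be consumed, here is a minimal Python sketch that dispatches on the action's isA value. The CommitActionDestroy and CommitActionModify class names come from the proposal above; the in-memory store and the apply_action helper are assumptions made for illustration, not existing Atomic Data APIs.

```python
# Hypothetical sketch: class and property names follow the proposal above,
# but the in-memory store and the apply_action helper are assumptions.
PROP = "https://atomicdata.dev/properties/"
CLS = "https://atomicdata.dev/classes/"


def apply_action(store: dict, action: dict) -> None:
    """Apply one commit action to `store`, a dict of subject -> property dict."""
    subject = action[PROP + "subject"]
    classes = action.get(PROP + "isA", [])

    if CLS + "CommitActionDestroy" in classes:
        # Destroy removes the whole resource; a missing subject raises KeyError.
        del store[subject]
    elif CLS + "CommitActionModify" in classes:
        # Modify merges the `set` values into the (possibly new) resource.
        resource = store.setdefault(subject, {})
        resource.update(action.get(PROP + "set", {}))
    else:
        # Unknown classes are rejected so that new action types fail loudly.
        raise ValueError(f"unsupported commit action: {classes}")
```

Adding another action type later would then only mean adding another branch (or a lookup table keyed by class).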

theduke avatar Jul 27 '22 12:07 theduke

I like this idea! Also, having actions instead of having a boolean modifier for destroying seems like a good idea. Having multiple subjects also opens up some questions / considerations.

When constructing a version of a resource, we currently use the identifier of the Commit as the identifier of the Resource version. If we allow multiple subjects per Commit, this can no longer work - we need a second parameter (i.e. the subject of the resource itself). This will make version references a bit more verbose, but I don't think that's a big issue.

However, if we also allow multiple commitActions in the same Commit, then we could not refer to intermediate states (e.g. after applying only 3 of the actions instead of all the actions). I don't think that's a big problem either, but we should be aware of it.
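
As a rough illustration of the two-part version reference described above, the sketch below keys resource versions by (commit, subject) instead of by commit alone. VersionRef, versions and record_versions are hypothetical names, not part of any existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionRef:
    """Identifies one resource version: which commit, and which subject in it."""
    commit: str   # identifier of the Commit
    subject: str  # subject of the resource changed by that Commit


# Hypothetical version index: VersionRef -> resource state after the commit.
versions = {}


def record_versions(commit_id: str, states: dict) -> None:
    """Store the post-commit state of every subject the commit touched."""
    for subject, state in states.items():
        versions[VersionRef(commit_id, subject)] = dict(state)
```

Only the state after the whole commit is recorded, which matches the point about intermediate states not being addressable.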

Also, I think I'll have to rewrite quite a bit of logic to allow for this, both client and server side. Commits are used throughout the entire application. Most logic resides in the handle_commit function in atomic-server. Rewriting this to work with multiple actions doesn't seem that complex. Executing these multi-Commits atomically (i.e. failing or succeeding as a whole) will pose some challenges, but that seems doable too.
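
One way to get the all-or-nothing behaviour is to apply the actions to a staged copy and only publish it once every action has succeeded. The Python sketch below shows that idea on a plain dict; it is not how handle_commit works today, and the simplified action shape is an assumption.

```python
import copy


def apply_atomically(store: dict, actions: list) -> None:
    """Apply every action or none of them.

    Simplified, hypothetical action shape: {"subject": str, "destroy": bool,
    "set": dict}. This is not the Commit schema, just enough to show the idea.
    """
    staged = copy.deepcopy(store)  # work on a private copy
    for action in actions:
        subject = action["subject"]
        if action.get("destroy"):
            if subject not in staged:
                raise KeyError(f"cannot destroy unknown subject: {subject}")
            del staged[subject]
        else:
            staged.setdefault(subject, {}).update(action.get("set", {}))
    # Reached only if no action raised: publish all changes together.
    # A real server would commit a database transaction at this point.
    store.clear()
    store.update(staged)
```

If any action fails, the exception propagates and the original store is left untouched.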

Could you perhaps provide some use cases that you had in mind for this?

joepio avatar Jul 29 '22 09:07 joepio

An in-between solution could be still giving each commit its own identity, and using batch commits just to group individual commits together with a resource array.

I have many use cases, which are all application-centric, not so much related to external data. Every time an application wants to alter multiple subjects atomically there needs to be some batch commit logic, otherwise data integrity can't be guaranteed.

theduke avatar Jul 30 '22 17:07 theduke

An in-between solution could be still giving each commit its own identity, and using batch commits just to group individual commits together with a resource array.

That is possible, but currently the ID of every commit is its signature. We could sign every Commit individually. But if we want to have one signature for multiple actions, we'd need a different way of thinking of Commit identifiers.

If we think of BatchCommits as a new type of resource, which contains a set of Commits, and only applies them if all of them are valid, I think we're good.

But implementing this will definitely be a bit of a challenge. I think I'll need to create a sled::Transaction or something similar and pass this around. Not impossible, but probably quite a refactor.

joepio avatar Aug 03 '22 15:08 joepio

That is possible, but currently the ID of every commit is its signature. We could sign every Commit individually. But if we want to have one signature for multiple actions, we'd need a different way of thinking of Commit identifiers.

I think this still would work fine: each commit is signed individually, and the batch commit just contains an array with the IDs of the individual commits and is also signed.
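
A rough Python sketch of that structure: every commit is signed on its own and identified by its signature, and the batch signs only the list of those IDs. To keep it dependency-free, HMAC-SHA256 from the standard library stands in for the real agent signature scheme; the field names and helper functions are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json


def sign(key: bytes, payload: dict) -> str:
    """Stand-in signature: HMAC-SHA256 over deterministic JSON (an assumption)."""
    message = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return base64.b64encode(hmac.new(key, message, hashlib.sha256).digest()).decode()


def make_commit(key: bytes, subject: str, set_values: dict) -> dict:
    """Sign a single-subject commit; its signature doubles as its ID."""
    body = {"subject": subject, "set": set_values}
    return {**body, "signature": sign(key, body)}


def make_batch(key: bytes, commits: list) -> dict:
    """Sign a batch that only lists the IDs of already-signed commits."""
    body = {"commits": [c["signature"] for c in commits]}
    return {**body, "signature": sign(key, body)}


def verify_batch(key: bytes, batch: dict, commits: list) -> bool:
    """Check the batch signature and every referenced commit signature."""
    expected = sign(key, {"commits": batch["commits"]})
    if not hmac.compare_digest(batch["signature"], expected):
        return False
    by_id = {c["signature"]: c for c in commits}
    for commit_id in batch["commits"]:
        commit = by_id.get(commit_id)
        if commit is None:
            return False  # batch references a commit we were not given
        body = {"subject": commit["subject"], "set": commit["set"]}
        if not hmac.compare_digest(commit_id, sign(key, body)):
            return False  # commit content does not match its ID
    return True
```

Because the individual commits keep their own identity, they can still be stored and referenced without the batch.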

An interesting aspect here is also syncing: one might want to expose/sync individual subjects, while batch commits might touch multiple subjects, including synced and unsynced ones. So the sync could only involve individual commits, not batches.

I haven't thoroughly thought this through, but I think batch commits are probably only really relevant locally for a single instance, not for distribution.
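
As a tiny sketch of that split, assuming each individual commit carries a "subject" field and the instance knows which subjects it shares (both hypothetical), the sync side would simply filter the commits of a batch:

```python
def commits_to_sync(batch_commits: list, synced_subjects: set) -> list:
    """Return only the individual commits whose subject is shared with peers."""
    return [c for c in batch_commits if c["subject"] in synced_subjects]
```

The batch itself never leaves the local instance; peers only see the individual commits for subjects they are allowed to sync.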

theduke avatar Aug 10 '22 11:08 theduke