[proposal] new Storage API
Is your feature request related to a problem? Please describe.
Veramo can currently use only a single data source for credentials, presentations, and messages: an implementation of the IDataStore and IDataStoreORM interfaces.
Most other Veramo top-level plugins use a layered approach, where the top-level plugin acts as a coordinator between multiple lower-level implementations of a common interface.
Also, the IDataStoreORM interface is based on a relational data model, with many assumptions about the connections between the data items being stored. This makes new implementations difficult, so a new solution should be adopted.
Describe the solution you'd like
A new storage model should support multiple data sources. Adopting such a pattern for storage would bring it in line with the rest of the API, as well as allowing users to store data in multiple locations (local private data, remote backup, remote inbox service, remote public information).
The solution chosen should be able to run queries on the data. Examples of common queries include:
- all credentials issued by an issuer
- issued after a certain date
- containing a certain claim
- matching a certain Type
- all messages of a particular type
- all presentations that include a certain @context
- a VC/VP/message by ID
- ...
Equally important is the ability to filter credentials using the JSONPath matching used in the DIF Presentation Exchange and Credential Manifest protocols.
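To make this concrete, such filters could be expressed as plain objects that each storage implementation knows how to evaluate. The `QueryFilter` shape below is purely illustrative and is not an existing Veramo type:

```ts
// Hypothetical filter shapes; these types do not exist in Veramo today.
type QueryFilter =
  | { type: 'JSONPath'; filter: string }
  | { type: 'attribute'; field: string; op: 'Equal' | 'After' | 'Contains'; value: string }

// Examples matching the query list above:
const byIssuer: QueryFilter = { type: 'attribute', field: 'issuer', op: 'Equal', value: 'did:example:123' }
const afterDate: QueryFilter = { type: 'attribute', field: 'issuanceDate', op: 'After', value: '2022-01-01T00:00:00Z' }
const hasClaim: QueryFilter = { type: 'JSONPath', filter: '$.credentialSubject.degree' }
```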
Additional context
Some related projects already use an adapter pattern to support multiple sources:
We're currently upgrading and updating VCManager for our Snap. This is what we've come up with so far.
The idea is to keep the structure similar to the one we've used before, which we think covers all the use cases mentioned above.
A couple of notes:
- The filter type can be selected (JSONPath, etc.). Most use cases should use JSONPath, but we still wanted to leave room for other filter types.
- Filtering will be done in `VCStorePlugin` to keep `VCManager` as lightweight as possible.
- The `id` used in `delete` is a sha256 hash of the VC, since an `id` is not mandatory in every VC.
- `save` can save the same VC in one or more different stores (e.g. `save(vc, ['database', 'google_drive'])`).
- `query` will return all VCs if `filter` and `store` are not provided.
- A `clear` function can be added to remove all VCs from the selected `store`.
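A rough sketch of the manager interface these notes imply (all names below are illustrative, not the actual Snap code):

```ts
// Illustrative sketch only, not the actual VCManager/VCStorePlugin code.
type W3CVerifiableCredential = Record<string, unknown> // stand-in for the real Veramo type

interface VCQueryArgs {
  filter?: { type: string; filter: unknown } // e.g. type: 'JSONPath'
  store?: string | string[]                  // e.g. 'snap', 'database', 'google_drive'
  returnStore?: boolean
}

interface IVCManager {
  // Saves the same VC to one or more stores, e.g. save(vc, ['database', 'google_drive'])
  save(vc: W3CVerifiableCredential, store?: string | string[]): Promise<string[]>
  // Returns all VCs if neither filter nor store is provided
  query(args?: VCQueryArgs): Promise<W3CVerifiableCredential[]>
  // id is a sha256 hash of the VC, since not every VC carries its own id
  delete(id: string, store?: string | string[]): Promise<boolean>
  // Optional: remove all VCs from the selected store
  clear(store?: string | string[]): Promise<boolean>
}
```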
Thank you for the update, this is great!
I really like the fact that queries are propagated to each implementation because this would allow folks to run the query where the data lives instead of centralizing it locally.
I have a few questions:
- Is the filter type defining the type of query that follows? You mention "JSONPath", but would this be something like "couchdb" or "SQL" in other cases?
- If so, did you also sketch any error scenarios for when a VCStorePlugin implementation cannot implement a type of query, or when the query fails for some exceptional reason?
- How about storing something other than VCs, like presentations or messages, or anything else that looks like JSON and potentially has an `id`? It looks like the interface you defined is not limited to VCs, which is great, IMO.
And suggestions:
- The result of `query()` could be a list of objects containing data + metadata instead of using a `returnStore` parameter to select the return type. For example:

  [
    {data: {<W3CVC>}, meta: {store: "local", id: "asdf"}},
    {data: {<W3CVP>}, meta: {store: "remote", id: "fghj"}}
  ]

- All methods of VCStorePlugin should support an `options?: {}` parameter for easy customization without interface changes. Something like this could then be used to forward authorization parameters for data stores that require them.
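To make both suggestions concrete, the return shape and the extra parameter could look roughly like this (all type names are made up for illustration):

```ts
// Hypothetical shapes for the two suggestions above; names are made up.
interface QueryResult<T = unknown> {
  data: T                             // the stored object itself (VC, VP, message, ...)
  meta: { store: string; id: string } // where it lives and how to address it
}

interface VCStorePluginQueryArgs {
  filter?: { type: string; filter: unknown }
  // Free-form options for customization without interface changes,
  // e.g. authorization parameters for stores that require them.
  options?: Record<string, unknown>
}

interface IVCStorePlugin {
  query(args?: VCStorePluginQueryArgs): Promise<QueryResult[]>
}
```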
Please correct me if I misunderstood the API you described for filtering.
Hey, great feedback!
For the questions:
- The filter type defines the query that follows. The filter can be anything from `'SQL'` to `'JSONPath'`.
- Filters should be defined in StorePlugins. If you try to use a type of filter that is not supported in that specific StorePlugin, it should throw an error.
- This model should definitely work in a more generalized form.
With these questions and suggestions in mind we updated the diagram.
Some notes:
- DataManager
  - save -> call the save function of the selected `store`. If multiple stores are provided, save the `data` object in all of them.
  - batchSave -> go through the array of objects and join the ones with the same `store`. Call the batchSave function for every selected `store` and save an array of `data` objects.
  - query -> call query on the selected `store`. If multiple stores are selected, query through all of them and join the results. Update `meta` for every result with the `store` if `returnStore` is set to true.
  - delete -> call the delete function of the selected `store`.
  - batchDelete -> same as batchSave: go through the array and join the ones with the same `store`. Call batchDelete for every selected store.
  - clear -> call clear for the selected `store`.
- AbstractDataStore
  - save -> save one `data` object. The object type should be validated here (e.g. VCStorePlugin should throw an error if the object is not of type W3CVerifiableCredential).
  - batchSave -> save multiple `data` objects. The type should be validated here.
  - query -> should throw an error if `filter.type` is not supported (e.g. SnapVCStore will probably only support `JSONPath`). If `filter` is not provided, return all `data` objects; otherwise filter through them according to `filter.type`.
  - delete -> remove one object based on its `id`.
  - batchDelete -> remove multiple `data` objects based on their `id`.
  - clear -> delete all `data` objects. If `filter` is provided, only delete the corresponding `data` objects.
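For illustration, a minimal sketch of how this dispatch could look (simplified, not the actual implementation):

```ts
// Simplified sketch of the dispatch logic described above; not the actual implementation.
interface Filter {
  type: string // e.g. 'JSONPath'
  filter: unknown
}

abstract class AbstractDataStore {
  abstract save(args: { data: unknown }): Promise<string>
  abstract query(args: { filter?: Filter }): Promise<unknown[]>
  abstract delete(args: { id: string }): Promise<boolean>
  abstract clear(args: { filter?: Filter }): Promise<boolean>
}

class DataManager {
  constructor(private stores: Record<string, AbstractDataStore>) {}

  // Save the data object in every selected store
  async save(data: unknown, store: string[]): Promise<string[]> {
    return Promise.all(store.map((s) => this.stores[s].save({ data })))
  }

  // Query every selected store, join the results and (optionally) tag each result with its store
  async query(args: { filter?: Filter; store?: string[]; returnStore?: boolean } = {}) {
    const selected = args.store ?? Object.keys(this.stores)
    const results = await Promise.all(
      selected.map(async (s) => {
        // Each store evaluates the filter itself and throws if filter.type is unsupported
        const items = await this.stores[s].query({ filter: args.filter })
        return items.map((data) => (args.returnStore ? { data, meta: { store: s } } : { data }))
      })
    )
    return results.flat()
  }

  // Delete one object by id from the selected store
  async delete(id: string, store: string): Promise<boolean> {
    return this.stores[store].delete({ id })
  }

  // Remove all (or the filtered) objects from the selected store
  async clear(store: string, filter?: Filter): Promise<boolean> {
    return this.stores[store].clear({ filter })
  }
}
```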
Hope we haven't missed anything.
We implemented the proposed DataManager in our Snap (ignore the outdated readme) and created the plugins to cover our needs. We'd love to hear your feedback.
I think it's shaping up really well. To really test it out we'd need to put it up against some real-world queries, and see where it starts to produce friction.
Do you plan to raise a PR to push your implementation upstream?
I'm a bit late to the party here, but it's great to see this conversation happening.
> Filter type defines the query that follows. Filter can be anything from `'SQL'` to `'JSONPath'`. Filters should be defined in StorePlugins. If you try to use a type of filter that is not supported in that specific StorePlugin it should throw an error.
I'm strongly of the opinion that the filter query format should remain the same regardless of the underlying storage engine. Otherwise you can't simply switch out different storage engines without impacting the rest of the codebase. This requires Veramo to be opinionated on the type of query format.
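For example (all names here are illustrative), a single engine-agnostic query object could be translated by each adapter into its native query language, so the calling code never changes when the engine is swapped:

```ts
// Purely illustrative: one engine-agnostic query object, translated by each adapter.
interface EngineAgnosticQuery {
  where: Array<{ column: string; op: 'Equal' | 'Like' | 'After'; value: string }>
}

interface QueryableAdapter {
  query(q: EngineAgnosticQuery): Promise<unknown[]>
}

class SqlAdapter implements QueryableAdapter {
  async query(q: EngineAgnosticQuery): Promise<unknown[]> {
    // Translate to SQL, e.g. WHERE issuer = $1, and run it against the database.
    return []
  }
}

class DocumentDbAdapter implements QueryableAdapter {
  async query(q: EngineAgnosticQuery): Promise<unknown[]> {
    // Translate to the engine's native selector (e.g. a CouchDB/Mango-style object) and run it.
    return []
  }
}

// The same query works against any adapter:
const query: EngineAgnosticQuery = { where: [{ column: 'issuer', op: 'Equal', value: 'did:example:123' }] }
```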
One of the things I struggled with was managing the multiple different databases required for the whole library to work effectively. There's a lot of cross-referencing required throughout the library, i.e. most parts of the library require access to DIDs. As such, the Veramo singleton needs to easily expose access to those databases in some way.
From the PoC Verida DBManager implementation:
dids: new VeridaDataStoreAdapter(await this.veridaContext.openDatabase('veramo_dids')),
credentials: new VeridaDataStoreAdapter(await this.veridaContext.openDatabase('veramo_credentials')),
presentations: new VeridaDataStoreAdapter(await this.veridaContext.openDatabase('veramo_presentations')),
claims: new VeridaDataStoreAdapter(await this.veridaContext.openDatabase('veramo_claims')),
messages: new VeridaDataStoreAdapter(await this.veridaContext.openDatabase('veramo_messages'))
Now, it makes sense that all these are needed, but I feel that specifying them all in a single DbManager class is a bit backwards.
My preference would be to have separate generic classes with their own storage configuration. For example, something like this:
const veridaDbCredentials = {}
const sqlDbCredentials = {}

const veramoConfig = {
  dids: {
    // A single datastore
    datastore: new VeridaDatastore('did', veridaDbCredentials)
  },
  credentials: {
    // Multiple datastores
    datastore: [new VeridaDatastore('credentials', veridaDbCredentials), new SQLDatastore('credentials', sqlDbCredentials)]
  }
}

// Generic query interface regardless of the storage engine
interface VeramoQueryInterface {}

// Generic storage engine interface for all use cases
interface DatastoreInterface {
  save(data: unknown): Promise<void>
  query(query: VeramoQueryInterface): Promise<unknown[]>
  delete(id: string): Promise<void>
  // ...
}

// Two separate storage engine implementations
class VeridaDatastore implements DatastoreInterface {}
class SQLDatastore implements DatastoreInterface {}
The Veramo components don't need to expose access directly to the underlying storage engines, but should provide appropriate interfaces to query, save, delete, etc. (similar to what currently happens) and encapsulate any other logic required. I noticed how adjacent metadata was created in different databases for a single action, so it's important this continues where necessary. Exposing the storage engines directly would risk breaking this type of logic.
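For example (purely illustrative), a facade over the datastores could keep that adjacent-metadata logic behind a single call:

```ts
// Illustrative only: keep cross-database bookkeeping (e.g. claim records derived
// from a credential) behind one facade call, so callers never touch the engines
// directly and swapping engines cannot break that logic.
interface SimpleDatastore {
  save(data: unknown): Promise<void>
}

class CredentialStoreFacade {
  constructor(
    private credentials: SimpleDatastore,
    private claims: SimpleDatastore
  ) {}

  async saveCredential(vc: { credentialSubject: Record<string, unknown> }): Promise<void> {
    await this.credentials.save(vc)
    // Persist the adjacent claim records as part of the same operation
    for (const [type, value] of Object.entries(vc.credentialSubject)) {
      await this.claims.save({ type, value })
    }
  }
}
```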