chunked entry loading and traversal
This issue is being created to track exploration of chunked entry loading and traversal. This is mentioned in Road to 1.0 as part of the sections Async Iterators / Streaming and Hot/Cold Data Separation (in-memory vs. on-disk data).
With all OrbitDB databases, updates are added as entries to a log. Currently all of the log's entries live in memory for the whole time the database is open. While this might be preferred in some cases, it may be terribly inefficient in others.
We need to find a way to load and traverse entries in chunks while still maintaining the speed of the operations that rely on them.
What is entry loading and traversal?
In this context, entry loading refers to loading entries into memory, and traversal is reading the log in its complete order. The goal is to do this in chunks, meaning loading and traversing sections of the log at a time.
With all the entries in memory, traversing the log is pretty simple and fast. You start from the latest entries, which reference the entries before them, and read down the log as far as needed. There is no need to verify signatures, check access control, or decide how to handle missing entries, as all of that happened when they were loaded as part of the log.
When entries are loaded and traversed in chunks, these issues need to be managed well for traversal to remain efficient. I'm confident this is possible, and with optimizations to index calculation (recalculation in particular), it could become snappier in most cases.
On to possible solutions for chunked traversal; we should start with the most basic one.
We track the CIDs of the log heads, tails, and missing entries, stored as mutable values somewhere.
When loading an entry from its CID, it is checked for:
- log membership marker
- valid signature
- valid entry interface
If any check does not pass, the entry is discarded.
The entry is then checked against log access control, and discarded if it does not pass. The entry is now loaded and considered part of the log, ready to be read by traversal.
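Very roughly, the loading step could look something like this. Every function passed in here is a hypothetical stand-in for whatever the real log and access controller code ends up exposing, so treat it as a sketch rather than an implementation:

```js
// Sketch only: all injected functions are hypothetical stand-ins.
async function loadEntry (cid, {
  fetchEntry,       // cid -> Promise<raw entry>, e.g. a read from IPFS or disk
  isMember,         // entry -> has this log's membership marker?
  verifySignature,  // entry -> Promise<boolean>
  isValidEntry,     // entry -> conforms to the entry interface?
  canAppend         // entry -> Promise<boolean>, the access-control check
}) {
  const entry = await fetchEntry(cid)

  // Discard anything that is malformed, unsigned, or not part of this log
  if (!isMember(entry)) return null
  if (!(await verifySignature(entry))) return null
  if (!isValidEntry(entry)) return null

  // Then the access check; discard if it does not pass
  if (!(await canAppend(entry))) return null

  // The entry is now loaded and considered part of the log
  return entry
}
```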
When traversing entries, we start from the entry CIDs in the heads, load them, and sort them. We then repeat with the next layer of predecessors (taken from the next field only), and continue this way until no more predecessor entries can be found. We never attempt to load an entry whose CID is in the missing set, so that traversal doesn't get stuck trying to load a remote entry.
(There is a specified total order for traversal which isn't important here, the causal links are what matters for this)
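As a rough illustration, assuming heads is an array of head CIDs, missing is a Set of CIDs we know we can't load locally, and load is a cid -> entry function like the loadEntry sketch above (again, not a real API):

```js
// Sketch of chunked traversal using only the causal links: start from the
// heads and walk one layer of predecessors at a time, skipping CIDs in the
// missing set so we never block waiting on a remote entry.
async function * traverse ({ heads, missing }, load) {
  let layer = [...heads]
  const seen = new Set()

  while (layer.length > 0) {
    const next = []
    for (const cid of layer) {
      if (seen.has(cid) || missing.has(cid)) continue
      seen.add(cid)

      const entry = await load(cid)
      if (!entry) continue          // discarded by the checks above

      yield entry                   // (per-layer sorting for the total order omitted)
      next.push(...entry.next)      // predecessors come from the next field only
    }
    layer = next                    // repeat until no more predecessors are found
  }
}
```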
This solution could be fine in practice, but it may not be sufficient in some situations, or it may have flaws I haven't considered. It may be possible to efficiently track whether a signature has already been validated, the sets of CID edges, or even the entire entry set. Luckily this does not affect the protocol at all, so people could upgrade as we find better ways to do this without affecting their databases.
Do I understand this correctly:
Instead of loading the entire log into memory, it is loaded in chunks.
But what is the benefit of this if, in the end, you still have to provide a full index for updateIndex to be successful? The only benefit I can see is that you'd only have to hold a few chunks of the oplog in memory at once, with the rest remaining on disk.
Or do you propose to change the Index interface too? Maybe using functions that don't expect the whole index but only single entries to be applied as they are received and checked as you described above?
Another thought I'd like to bring up here again, since I already discussed it once on Matrix, is the idea of Lazy Indexes: in essence, these indexes only parse the oplog once data has been requested through one of the get methods. It might be possible to extend this notion to include not just parsing but also fetching entries into memory.
Why should a node need to keep an entry in memory that it has no need for? And once an entry has been parsed, it could be stored in "hot" memory for faster lookups later.
Admittedly this idea is not yet a full proposal and would perhaps require rewriting the API for Stores & Indexes, turning their getters into async functions, but it might be worth it for larger databases.
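A very rough sketch of what such a lazy index could look like, assuming some newest-first async iterator over the log (like the traverse sketch above). None of this is the real Store/Index API:

```js
// Hypothetical lazy key-value index: nothing is parsed until get() is called,
// and anything parsed once is kept in a "hot" in-memory map for later lookups.
class LazyIndex {
  constructor (traverse) {
    this._traverse = traverse   // () -> async iterator over the oplog, newest first
    this._hot = new Map()
  }

  async get (key) {
    if (this._hot.has(key)) return this._hot.get(key)

    // Walk the log only as far as needed to resolve this key
    for await (const entry of this._traverse()) {
      const { key: k, value } = entry.payload
      // Newest-first traversal: the first value seen for a key is the latest
      if (!this._hot.has(k)) this._hot.set(k, value)
      if (k === key) return this._hot.get(key)
    }
    return undefined
  }
}
```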
Third, and lastly, I'd like to ask if it is really advisable to try and optimize for larger OrbitDB Databases? If you were in the standard SQL or NoSQL DB world, this would make sense, but in OrbitDB a single database more or less corresponds to a single row in a SQL Table or document in a NoSQL DB rather than a full DB in these paradigms.
Maybe OrbitDB should be more focused on making this design pattern, as already employed in the Field Manual Tutorial, obvious and optimize for that instead of for large DBs? Maybe by encouraging the storage of classes and other objects with finite fields in DBs, similar to ORMs?
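To make that concrete, the pattern might look roughly like this. openProfileStore and the directory store here are purely hypothetical stand-ins; the real calls would depend on which store types are used:

```js
// Sketch of the "one database per object" pattern: instead of one huge
// table-like database, each profile gets its own small database, and a
// directory store only maps profile ids to database addresses, ORM-style.
async function saveProfile (orbitdb, directory, profile) {
  const store = await openProfileStore(orbitdb, `profile.${profile.id}`)
  for (const [field, value] of Object.entries(profile)) {
    await store.put(field, value)
  }
  await directory.put(profile.id, store.address.toString())
}
```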
> Or do you propose to change the Index interface too? Maybe using functions that don't expect the whole index but only single entries...
Yes, the interface will most likely need to be changed to an async iterative one.
> the idea of Lazy Indexes - in essence these indexes only parse the oplog once data has been requested through one of the get methods...
I believe this is suggested in Road to 1.0 as well, and yes, it would require turning the read methods async.
> Third, and lastly, I'd like to ask if it is really advisable to try and optimize for larger OrbitDB Databases? If you were in the standard SQL or NoSQL DB world, this would make sense, but in OrbitDB a single database more or less corresponds to a single row in a SQL Table or document in a NoSQL DB rather than a full DB in these paradigms.
> Maybe OrbitDB should be more focused on making this design pattern, as already employed in the Field Manual Tutorial, obvious and optimize for that instead of for large DBs? Maybe by encouraging the storage of classes and other objects with finite fields in DBs, similar to ORMs?
I like this point. Partitioning databases seems like it could be useful for a lot of cases. I'd love to see more tools that make this easier. Keeping entries in memory only temporarily doesn't just optimize for larger databases.
I'm sure you are familiar with the community chat complaints we used to get about memory hogging, especially on memory-limited cloud instances; although I don't think it's all in OrbitDB land, this is one area where we could improve that metric. I think it's also important to keep the option of having all entries stay in memory.
On the point of the new interface:
Based on your elaboration above, I think it would look a little like this:
```js
class Index {
  constructor() {
    this._index = {}
  }

  // Rebuilds the whole index by iterating over every entry in the oplog
  async updateIndex(oplog) {
    this._index = {}
    for await (const entry of oplog.values) {
      this._index[entry.payload.key] = entry.payload.value
    }
  }
}
```
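And if the interface does become async-iterative as you suggest, the same index could apply single entries as they are received and verified instead of rebuilding from the whole oplog each time. Again just a rough sketch, not a settled API:

```js
class Index {
  constructor() {
    this._index = {}
  }

  // Entries are applied one at a time as they arrive, so only the index
  // itself has to stay in memory rather than the whole oplog.
  async updateIndex(entries) {   // entries: any async iterable of log entries
    for await (const entry of entries) {
      const { key, value } = entry.payload
      // Assuming newest-first traversal, keep the first value seen per key
      if (!(key in this._index)) this._index[key] = value
    }
  }
}
```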
> Partitioning databases seems like it could be useful for a lot of cases.
I'll try to elaborate on an OrbitDB Object-Relational Mapping (ORM) and other applications of partitions in another issue.
How do you want to go about implementing chunking?