rkv
document various limitations of LMDB
We might want to document the various limitations of LMDB in order to offer the "least surprise" to rkv users. Off the top of my head, LMDB has the following limitations:
- Max key size (MDB_MAXKEYSIZE = 511 bytes). This limit also applies to values in a dupsort store. Note that it is a compile-time setting and can't be changed at runtime.
- Max environment size (default 10 MB). Once the environment is full, subsequent writes will fail.
- Max number of databases (default 5). Hosting a moderate number of databases (say, up to a few dozen) in a single environment is fine, but hosting too many has both a memory and a performance cost.
- Max readers (default 126): the maximum number of readers/threads allowed to access an LMDB environment. Once this limit is reached, new readers cannot be created.
- Too many writes in a single transaction. I'm unsure what exactly the maximum number of writes is, but LMDB may complain during a bulk load done in a single transaction. To work around this, we can periodically commit the transaction and start a new one for the remaining writes.
- Writes are serialized by a write lock, which means only one writer can be active at any point in time; other writers are blocked until the active writer aborts or commits. As such, favor short write transactions in multi-writer scenarios, and keep in mind that waiting writers are blocked in the meantime.
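The "commit periodically" workaround for bulk loads could look roughly like the sketch below. It is not rkv's API: `write_in_batches` and the `write_batch` callback are hypothetical names, and the callback is assumed to open one write transaction, put every pair in the slice, and commit before returning.

```rust
/// Writes `items` in chunks of at most `batch_size` pairs.
///
/// `write_batch` is a hypothetical callback (not part of rkv or LMDB): it is
/// assumed to open one write transaction, put every pair in the slice, and
/// commit before returning. Returns the number of batches committed.
pub fn write_in_batches<K, V, E>(
    items: &[(K, V)],
    batch_size: usize,
    mut write_batch: impl FnMut(&[(K, V)]) -> Result<(), E>,
) -> Result<usize, E> {
    // `chunks` panics on a zero chunk size, so clamp to at least 1.
    let batch_size = batch_size.max(1);
    let mut committed = 0;
    for batch in items.chunks(batch_size) {
        // Stop at the first failure; batches committed earlier stay committed.
        write_batch(batch)?;
        committed += 1;
    }
    Ok(committed)
}
```

The trade-off is that the load is no longer atomic as a whole: if a later batch fails, the earlier ones are already committed. Smaller batches also keep each write transaction short, which helps with the single-writer lock mentioned in the last bullet.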
> Max environment size (default as 10 MB), once the environment is full, following writes will fail
The LMDB docs say that the default is 10MB, but the code actually sets DEFAULT_MAPSIZE to 1MB:
https://github.com/mozilla/lmdb/blob/4f9fe9fcead4ce3c49d80d39526472b5c274b188/libraries/liblmdb/mdb.c#L729
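Given that mismatch, it is probably safer to pick a map size explicitly than to rely on either default. Here is a minimal sketch of the arithmetic: the constant and function names are made up for illustration, and the 4096-byte page size in the comment is an assumption (the LMDB docs say the size should be a multiple of the OS page size).

```rust
/// Default map size as stated in the LMDB docs (10 MiB).
pub const DOCUMENTED_DEFAULT_MAP_SIZE: usize = 10 * 1024 * 1024;
/// DEFAULT_MAPSIZE as set in the linked mdb.c (1 MiB).
pub const ACTUAL_DEFAULT_MAP_SIZE: usize = 1024 * 1024;

/// Rounds `requested` up to the next multiple of `page_size`.
///
/// `page_size` is supplied by the caller; 4096 is a common value but is an
/// assumption here, not something queried from the OS.
pub fn round_up_to_page(requested: usize, page_size: usize) -> usize {
    assert!(page_size > 0, "page size must be non-zero");
    let rem = requested % page_size;
    if rem == 0 {
        requested
    } else {
        requested + (page_size - rem)
    }
}
```

The result would then be passed to whatever call sets the environment's map size before any data is written.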
Another limitation that would be useful to document is the maximum size of a value. Presumably it's MAXDATASIZE, which is 0xffffffffUL (4,294,967,295 bytes).
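If these size limits do get documented, rkv users could check them up front along the lines of the sketch below. It assumes the default 511-byte key limit and takes the presumed MAXDATASIZE value limit from this thread at face value; `check_sizes` and `SizeError` are hypothetical names, not part of rkv.

```rust
/// Default compile-time key limit (MDB_MAXKEYSIZE); a custom LMDB build may differ.
pub const MAX_KEY_SIZE: usize = 511;
/// Presumed value limit (MAXDATASIZE), as discussed in this thread.
pub const MAX_DATA_SIZE: u64 = 0xffff_ffff;

/// Hypothetical error type for this sketch; carries the offending length.
#[derive(Debug, PartialEq, Eq)]
pub enum SizeError {
    KeyTooLong(usize),
    ValueTooLong(usize),
}

/// Checks a key/value pair against the limits above.
///
/// For a dupsort store the key limit is applied to the value as well.
pub fn check_sizes(key: &[u8], value: &[u8], dupsort: bool) -> Result<(), SizeError> {
    if key.len() > MAX_KEY_SIZE {
        return Err(SizeError::KeyTooLong(key.len()));
    }
    if dupsort && value.len() > MAX_KEY_SIZE {
        return Err(SizeError::ValueTooLong(value.len()));
    }
    if value.len() as u64 > MAX_DATA_SIZE {
        return Err(SizeError::ValueTooLong(value.len()));
    }
    Ok(())
}
```

Failing early like this gives callers a clearer error than whatever LMDB returns at write time, at the cost of hard-coding limits that a custom LMDB build could change.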