Max Key Size
My keys are URLs, and many exceed the max key size (511 bytes). Maybe I could use a hash of the URL as the key and store the URL itself as the value. It would be nice to have an adjustable key size.
According to AI:
The maximum key size in LMDB (Lightning Memory-Mapped Database) is generally not configurable through Nim wrapper libraries like limdb at runtime or via simple settings.
Key Points:
Compile-Time Constant: The limit is determined by a compile-time constant, typically named MDB_MAXKEYSIZE, within the underlying C LMDB library itself. This constant is set when the C library is compiled, and it usually defaults to 511 bytes.
Nim Wrapper Role: The Nim wrapper library simply uses the C library it's linked against. It can query the max key size the C library reports, but it cannot change it after the C library has been compiled.
Changing the Limit (Advanced): To increase the limit, you must:
Get the LMDB C library's source code.
Modify the value of the MDB_MAXKEYSIZE constant in the C source (it is defined in mdb.c behind an #ifndef, so it can also be set with a -D compiler flag).
Recompile the C library from the modified source.
Ensure your Nim application links against this newly compiled custom C library instead of the standard one.
Considerations: This process is complex, and databases created with different max key sizes might have compatibility issues. It's crucial that all programs accessing the same database use an LMDB C library compiled with the same MDB_MAXKEYSIZE.
Common Alternatives: Due to the complexity and compatibility concerns, it's often more practical to work within the default limit by using techniques like:
Hashing longer keys and using the hash as the database key.
Skipping records with keys that exceed the limit.
Redesigning the key structure to ensure keys fit.
In essence, changing the max key size is a modification to the underlying C library, not a setting within the Nim wrapper.
Thanks for your report!!
Interesting, I didn't know that. A quick Google search confirms the AI's answer: there is indeed an MDB_MAXKEYSIZE compile-time flag.
I think the most straightforward solution for you would be to build your own LMDB with the MDB_MAXKEYSIZE you want and install it on your system.
It only takes a few commands:
    git clone https://github.com/LMDB/lmdb.git
    cd lmdb/libraries/liblmdb
    make CPPFLAGS=-DMDB_MAXKEYSIZE=2047
    sudo cp liblmdb.so /usr/local/lib
    sudo ldconfig
One caveat: the comments in mdb.c warn that a value larger than the computed maximum (which you get by setting MDB_MAXKEYSIZE to 0) can break things, so choose the value with care.
This should be in the docs; I'll leave the issue open until it is.
It would also be interesting to add a static version of lmdb. You could do that yourself in your Nim code by compiling with --dynlibOverride=lmdb and adding a {.compile: "mdb.c".} pragma (LMDB's C sources are mdb.c and midl.c), but that would probably need a little fiddling. If you do go this route, nim-lmdb would most likely love a patch.
Carlo, thanks for the quick reply and clear instructions. We now know the default limit, and it will be a deciding factor. I looked into it: some key:value databases allow large keys, others only small ones, and there are performance and resource trade-offs. Memcached limits keys to 250 bytes. Redis allows up to half a gigabyte. I won't modify lmdb, for maintenance reasons, but will treat this as a Table-like interface to a key:value database, where the index must be a hash or some other short value.
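A minimal sketch of that hash-as-index idea, assuming SHA-1 from Nim's std/sha1 module (newer Nim releases may mark it deprecated in favor of checksums/sha1 from the checksums package) and using a std/tables Table as a stand-in for the database. The names urlKey, putUrl and hasUrl are hypothetical and not part of limdb:

```nim
import std/[sha1, strutils, tables]

const maxKeySize = 511  # LMDB's default key size limit mentioned above

proc urlKey(url: string): string =
  ## Hypothetical helper: hex SHA-1 digest of the URL, always 40 characters.
  $secureHash(url)

proc putUrl(store: var Table[string, string], url: string) =
  ## Store the full URL as the value under its short, fixed-length hash key.
  store[urlKey(url)] = url

proc hasUrl(store: Table[string, string], url: string): bool =
  ## Compare the stored URL too, so a hash collision cannot give a false hit.
  let key = urlKey(url)
  key in store and store[key] == url

# A Table stands in for the database here.
var store = initTable[string, string]()

let longUrl = "https://example.com/search?q=" & repeat('x', 600)
doAssert longUrl.len > maxKeySize      # too long to be used as a key directly
doAssert urlKey(longUrl).len == 40     # the hash fits easily

store.putUrl(longUrl)
doAssert store.hasUrl(longUrl)
doAssert not store.hasUrl("https://example.com/other")
```

Keeping the original URL as the value means nothing is lost by hashing, and a lookup can confirm that the stored URL really is the one that was asked for.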
Sure!! That's some great info. Large keys with Redis are quite remarkable, I had no idea.
You might want to consider having a distinct type for the hash to prevent the class of bugs where you index using unhashed strings.
    import limdb

    # new type
    type
      SomeHash = distinct string

    # limdb type implementation
    template toBlob(h: SomeHash): Blob =
      h.string.toBlob

    template fromBlob(b: Blob): SomeHash =
      b.fromBlob(string).SomeHash

    template compare(a, b: SomeHash): int =
      if a.cstring < b.cstring:
        -1
      elif a.cstring > b.cstring:
        1
      else:
        0

    # database code
    let db = initDatabase("somePath", (SomeHash, string))
    db["foo".SomeHash] = "bar"

    # this would not compile
    # db["foo"] = "bar"
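To get the most out of the distinct type, it may help to funnel every conversion through one constructor, so that a SomeHash is always an actual hash. The sketch below is only an illustration under assumptions: toSomeHash and lookup are hypothetical names, lookup stands in for a database read, and SHA-1 from std/sha1 (possibly marked deprecated in newer Nim releases) is just one possible choice of hash.

```nim
import std/sha1

type
  SomeHash = distinct string

proc toSomeHash(url: string): SomeHash =
  ## Hypothetical constructor: the one place a URL is turned into a key.
  SomeHash($secureHash(url))

proc lookup(key: SomeHash): string =
  ## Stand-in for a database read that only accepts hashed keys.
  "value for " & key.string

let key = toSomeHash("https://example.com/some/very/long/path")
doAssert key.string.len == 40                # hex SHA-1 digest
doAssert compiles(lookup(key))               # a hashed key is accepted
doAssert not compiles(lookup("not hashed"))  # a plain string is rejected
```

An explicit conversion such as "foo".SomeHash still compiles, so the type only protects against accidental use of unhashed strings, not deliberate ones.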