lmdbxx
Cursor.get returns key and value concatenated into key variable
Weird output from cursor.get: the key variable ends up holding both the key and the value, while the value variable holds only the value:
#include <cstdio>
#include <cstdlib>
#include <lmdb++.h>
using namespace lmdb;

int getsize(const lmdb::env& e){
    auto t = lmdb::txn::begin(e.handle(), nullptr, MDB_RDONLY);
    auto d = lmdb::dbi::open(t, nullptr);
    int r=d.size(t);
    t.abort();
    return r;
}

int main() {
    auto env = lmdb::env::create();
    env.set_mapsize(1UL * 1024UL * 1024UL * 1024UL); /* 1 GiB */
    env.open("./example.mdb", 0, 0664);
    {
        auto wtxn = lmdb::txn::begin(env);
        auto dbi = lmdb::dbi::open(wtxn, nullptr);
        char a[6] = "hello";
        dbi.put(wtxn, "email", "hello");
        dbi.put(wtxn, "key", "value");
        dbi.put(wtxn, "user", "johndoe");
        wtxn.commit();
    }
    {
        auto rtxn = lmdb::txn::begin(env);
        auto dbi = lmdb::dbi::open(rtxn, nullptr);
        auto cursor = lmdb::cursor::open(rtxn, dbi);
        lmdb::val k, v;
        while(cursor.get(k, v, MDB_NEXT)){
            printf("We got '%s'\nValue '%s'\n", k.data(), v.data());
        }
    }
    {
        std::printf("size is %d\n", getsize(env));
    }
    return EXIT_SUCCESS;
}
Expected Output:
We got 'email'
Value 'hello'
We got 'key'
Value 'value'
We got 'user'
Value 'johndoe'
size is 3
Actual Output:
We got 'emailhello'
Value 'hello'
We got 'keyvalue'
Value 'value'
We got 'userjohndoe'
Value 'johndoe'
size is 3
It looks like you are assuming your values will be NUL-terminated:
printf("We got '%s'\nValue '%s'\n", k.data(), v.data());
But your puts above aren't writing the NUL byte.
Don't use the C string routines since you won't be able to store NUL bytes in your keys/values. Instead do something like:
std::string myKey(k.data(), k.size());
std::string myValue(v.data(), v.size());
std::cout << "We got " << myKey << " and " << myValue << std::endl;
(untested)
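To make the point above concrete without needing LMDB at all, here is a minimal sketch. The buffer "emailhello" only imitates what the output in the question suggests, namely that the value bytes happen to sit right after the key bytes in memory. The helper names to_string and print_bytes are made up for this illustration and are not part of lmdbxx.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Copy exactly `size` bytes. Unlike "%s", this never scans for a NUL.
std::string to_string(const char* data, std::size_t size) {
    return std::string(data, size);
}

// printf-style alternative for C++11 code: "%.*s" takes the maximum number
// of bytes to print as an int argument. It still stops early at an embedded
// NUL, so prefer std::string or fwrite for binary keys and values.
void print_bytes(const char* label, const char* data, std::size_t size) {
    std::printf("%s '%.*s'\n", label, static_cast<int>(size), data);
}
```

With the real cursor, the calls would presumably look like to_string(k.data(), k.size()) and print_bytes("We got", k.data(), k.size()).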
Or, better yet, upgrade to C++17 and use my fork and get the string_view hotness :)
I know about string_view and about your fork, but I have constraints that prevent me from going above C++11. Since you're here, could you tell me how I can force my database to be sorted by insertion order (instead of lexicographically)? Also, on first creation of the database, it always crashes with "bus_error", but if I open the environment, close it, then open it again and execute my code, it works.
EDIT: How can I make sure my puts do add the null byte, in order to avoid this problem?
- Make it so your keys increase every time you insert, or use a secondary index.
- I don't know, I'd need code to reproduce it. Make sure you are creating a big enough MAPSIZE, and the same mapsize each time. Make sure you commit the transaction that creates the tables.
- I suggest not storing the NUL byte. It's a waste of space since LMDB tracks size anyway. Furthermore, relying on it means you are unable to store NUL bytes as part of your keys or values (as I described above).
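For the first suggestion (keys that increase with every insert), one possible encoding is a counter written as 8 big-endian bytes. With byte-wise lexicographic ordering, such keys sort in numeric order, which is the same as insertion order if the counter only goes up. This is a sketch, not something from lmdbxx: encode_be64 and decode_be64 are hypothetical helper names. LMDB also has an MDB_INTEGERKEY flag for native integer keys, which is not used here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Encode a counter as 8 big-endian bytes, so that byte-wise comparison of
// two encoded keys gives the same result as comparing the counters.
std::string encode_be64(std::uint64_t n) {
    std::string out(8, '\0');
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<char>(n & 0xFF);
        n >>= 8;
    }
    return out;
}

// Inverse of encode_be64; reads at most the first 8 bytes.
std::uint64_t decode_be64(const std::string& s) {
    std::uint64_t n = 0;
    for (std::size_t i = 0; i < 8 && i < s.size(); ++i) {
        n = (n << 8) | static_cast<unsigned char>(s[i]);
    }
    return n;
}
```

Note that these keys are full of NUL bytes, so they only work if keys are handled by pointer and size rather than as C strings.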
How can I use a secondary index?
Every time you insert into your main table, you also insert into another table. In this secondary table, the key is an increasing integer, and the value is the key into your main table.
Then when you want to iterate over your items in insertion order, iterate through the secondary index. For each value, use it as a key to look up the item from the main table.
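A rough sketch of that scheme, using two std::map objects as stand-ins for the two LMDB tables (std::map keeps its keys sorted, like an LMDB database does). OrderedStore, seq_key and the other names are invented for illustration. In real LMDB code both puts would presumably go in the same write transaction so the tables cannot drift apart.

```cpp
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct OrderedStore {
    std::map<std::string, std::string> main_table;   // key -> value
    std::map<std::string, std::string> order_table;  // sequence key -> main key
    std::uint64_t next_seq = 0;

    // Fixed-width, zero-padded decimal so string order equals numeric order.
    static std::string seq_key(std::uint64_t n) {
        char buf[21];
        std::snprintf(buf, sizeof buf, "%020llu",
                      static_cast<unsigned long long>(n));
        return std::string(buf, 20);
    }

    void put(const std::string& key, const std::string& value) {
        // Only new keys get a sequence entry; overwrites keep their slot.
        if (main_table.find(key) == main_table.end()) {
            order_table[seq_key(next_seq++)] = key;
        }
        main_table[key] = value;
    }

    std::vector<std::string> keys_in_insertion_order() const {
        std::vector<std::string> out;
        for (const auto& entry : order_table) {
            // Each secondary value is a key into the main table.
            if (main_table.count(entry.second)) out.push_back(entry.second);
        }
        return out;
    }
};
```

Iterating main_table directly still yields lexicographic order; only the walk through order_table gives insertion order.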
I'm sorry for all these questions, but I can't find where else to ask them. Is it possible to get the position of an entry without iterating through my database with a cursor and incrementing a counter?
It's OK. What do you mean by "get the position"? You can position the cursor directly at a known key with cursor ops like MDB_SET, or go to somewhere nearby with MDB_SET_RANGE.
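As an analogy only: MDB_SET positions the cursor at an exact key, and MDB_SET_RANGE positions it at the first key greater than or equal to the one given. On a std::map, which is sorted like an LMDB database, that corresponds roughly to find and lower_bound. The names Table, seek_exact and seek_range below are hypothetical stand-ins, not LMDB or lmdbxx calls.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> Table;

// Analogue of MDB_SET: succeeds only on an exact key match.
bool seek_exact(const Table& t, const std::string& key, std::string& found) {
    Table::const_iterator it = t.find(key);
    if (it == t.end()) return false;
    found = it->first;
    return true;
}

// Analogue of MDB_SET_RANGE: first key greater than or equal to `key`.
bool seek_range(const Table& t, const std::string& key, std::string& found) {
    Table::const_iterator it = t.lower_bound(key);
    if (it == t.end()) return false;
    found = it->first;
    return true;
}
```

With the three keys from the question, seeking "k" fails as an exact match but lands on "key" as a range seek.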
If you mean get an item by its position index (i.e., get the 10th element in the DB), then AFAIK this is not possible. If you're always appending to the DB (and never removing or inserting in the middle), then you could maintain the position index in a secondary index.
I meant doing a dbi.get() to search for the position index of a certain key (e.g., inputting "user_email" and receiving 10). Is this possible without holding duplicate databases with secondary indexes?
AFAIK, no.
I'll see which one would be more cost effective, iterating through the database and counting, or keeping a duplicate of each concerned database with indices.
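For comparison, the two options could be sketched like this, again with std::map standing in for LMDB tables. position_by_scan and PositionIndex are made-up names. The second option assumes entries are only ever appended, never removed or inserted in the middle, as noted earlier in the thread.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> Table;

// Option 1: walk the table in key order and count. O(n) per lookup, no
// extra storage. Returns -1 when the key is absent.
long position_by_scan(const Table& t, const std::string& key) {
    long pos = 0;
    for (Table::const_iterator it = t.begin(); it != t.end(); ++it, ++pos) {
        if (it->first == key) return pos;
    }
    return -1;
}

// Option 2: a second table mapping key -> position, filled in at insert
// time. Lookups are a single get, at the cost of storing every key twice.
struct PositionIndex {
    std::map<std::string, long> position_of;
    long next;

    PositionIndex() : next(0) {}

    void on_append(const std::string& key) {
        if (position_of.find(key) == position_of.end()) {
            position_of[key] = next++;
        }
    }

    long lookup(const std::string& key) const {
        std::map<std::string, long>::const_iterator it = position_of.find(key);
        return it == position_of.end() ? -1 : it->second;
    }
};
```

Which one is cheaper depends on how often positions are looked up relative to how large the table is.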