sophia icon indicating copy to clipboard operation
sophia copied to clipboard

API question: cursors and databases

Open os12 opened this issue 6 years ago • 11 comments

Hi Dmitry, I've started stated testing Sophia and hit an issue: a cursor-based walk returns keys across databases. Assume two databases, "db1" and "db2". One of them is opened and keys are fetched using a cursor:

    void *env = sp_env();

    sp_setstring(env, "sophia.path", ".store", 0);
    sp_setstring(env, "db", db_name.c_str(), 0);
    if (sp_open(env) == -1)
        return FromStringError(LastSophiaError(env));

    void *db = sp_getobject(env, ("db." + db_name).c_str());
    if (db == nullptr)
        return FromStringError(LastSophiaError(env));

    void *cursor = sp_cursor(env);
    void *doc = sp_document(db);
    sp_setstring(doc, "order", ">=", 0);

    while ((doc = sp_get(cursor, doc))) {
        int size;
        auto key_ptr = static_cast<const char *>(sp_getstring(doc, "key", &size));
        ...
        ...
    }

Am I missing something here? Do cursors go across databases? If so, how do I provide an additional filter to limit iteration to a single DB?

Basically, I can achieve the behavior I want by splitting the databases at a higher level this way:

    sp_setstring(env, "sophia.path", (".store/" + db_name).c_str(), 0);
    sp_setstring(env, "db", "mydb", 0); // hard-coded name

Could you clarify the design behind the aforementioned "path "and "db" settings please?

Thanks a lot! Oleg.

os12 avatar Jan 15 '18 18:01 os12

Hi, not quite sure what the issue in here. There is no way records can overlap this way, only if you use different db pointers during sp_document() creation. Are you sure it sets correct name for sp_getobject()?

sophia.path specifies main directory, where all databases will be stored by name. For example, db.mydb will be placed in sophia.path/mydb

pmwkaa avatar Jan 16 '18 18:01 pmwkaa

Right, both the cursor-based "read" path and the transaction-based "write" path start with the same code (the first fragment). I do see per-database subdirectories created inside .store as well as log (which seems to contain the last few mutations).

I know you have tests that cover multi-db mutations. Yet, do you have a multi-db cursor test?

os12 avatar Jan 16 '18 22:01 os12

Could you please specify your work case you are trying to achieve? One cursor is designed to iterate only one database at a time. If you need to make a join between databases, you need to open a second cursor, and so on.

pmwkaa avatar Jan 17 '18 15:01 pmwkaa

Could you please specify your work case you are trying to achieve? One cursor is designed to iterate only one database at a time

Yes, that's what I expected. Yet the cursor returns keys from an adjacent DB, which is very surprising.

os12 avatar Jan 17 '18 19:01 os12

Please see #156 for the repro - a cursor-based walk in DB2 should find no keys (yet it fetches everything from the DB1).

os12 avatar Jan 19 '18 02:01 os12

Now i understand what you meant. This is known issue, which might be quite confusing. Sophia does not have persistent catalog of created databases, they must be predefined each time before sp_open(). This is done to support easier Sophia integration with DBMS's which have its own database catalog.

Open procedure will find and recover every specified database directory by its name. After that it will reply write-ahead log records. To identify database in write-ahead log db.name.id is used: http://sophia.systems/v2.2/conf/db.html. If database is not specified it will be set in sequential order of database definition.

pmwkaa avatar Jan 22 '18 10:01 pmwkaa

Umm... I don't follow. Is there a way for me do deal with this issue?

  • My application needs to read/write distinct DBs at random times (otherwise, I'd have to impose prefixes onto record names to fake that)
  • The "read" is usually started by opening a DB by name and then listing records using a cursor (they must be ordered).

If this does not work, I will have to create distinct directories to isolate the DBs... Are there downsides?

os12 avatar Jan 22 '18 19:01 os12

Sophia has a single write-ahead log which is shared between databases to efficient support cross-database transactions. To match a database in the log db.name.id is used.

sp_open() will fail if it will not find a defined database during log reply. In other words, scheme must be exactly the same every time you open environment. Basically, if you define mydb0 and mydb1, next time you need to define them again in the same order (or set db.name.id to distinct them).

Keep in mind, that open operation is very time consuming. Keep a database open, if you need good timing.

pmwkaa avatar Jan 22 '18 20:01 pmwkaa

OK, I've extended the code to set db.db1.id bit it's failing with this:

sophia/runtime/sr_conf.c:100 db.db1.id is read-only

Could you take a look at my second commit in #156 please? What am I missing?

os12 avatar Jan 22 '18 23:01 os12

Also, let me use this comment to collect misc notes/comments:

  1. [minor] The list of options at http://sophia.systems/v2.2/conf/db.html is hard to understand as it mentions things like db.name.id
    • it should really say something like db.$(db_name).id as the other string is meaningless
  2. [major] The across-db nature of the log file makes things very tricky:
    • it should be documented somewhere in BOLD
    • these DBs are not actually independent (as they share the write-ahead log) and act more like namespaces
    • the code is hard to use in environments that create databases on the fly
      • the user must maintain an ordered list of DB names and open the entire set every time
      • a new DB can only be added during Environment initialization and only as a last one
        • sp_open() is expensive and so the DBs should not be closed
        • yet the environment must be torn down and recreated in order to add a new DB

os12 avatar Jan 22 '18 23:01 os12

My fault, sorry about that. I completely forgot that custom id generation scheme been removed in v2.2 (so recommended scheme was actual only for previous version). Right now, the only way to make this work is to always define same databases in same order. I was planning to completely remake it.

It is also possible to force compaction and wait for its completion, this will guarantee that write-ahead records will not have database references. This will also increase open time dramatically.

Could you describe your use-case in more details? i might have some thoughts how to make it in optimal way.

pmwkaa avatar Jan 24 '18 10:01 pmwkaa