Lazy vector index updating

Open snej opened this issue 4 months ago • 2 comments

Adds an API that enables asynchronous computation and updating of a vector index. This is useful when generating the vector for a document requires a network request, or when vector generation is slowing down document updates.

To create a "lazy" vector index that is updated asynchronously:

  1. Set the (new) C4VectorIndexOptions.lazy property to true.
  2. The indexSpec parameter -- the expression to be indexed -- should not be the vector itself, but whatever value you need in order to compute or request the vector.

A lazy index doesn't update automatically when documents are created or updated. (But it does update when a document is deleted or purged, to remove the corresponding index entry.) If you query it right after creating it, before running any update, you won't get any results.

You (the app) decide when to update the index, probably either in response to a collection change notification or when about to query the index.

  1. Call the new C4Collection.getIndex method to get a C4Index reference to the index. (It's best to keep this reference around instead of releasing it; otherwise some queries have to be recompiled every time.)
  2. Call beginUpdate on the index. This will return a C4IndexUpdater reference, or will return NULL if the index is already up-to-date.
  3. The updater has a list of FLValues, which are the values of the expression you gave when you created the index. For each of these, create a vector somehow and call the setVector method at the corresponding index.
  4. When done, call the updater's finish method. This must be done in a transaction as it updates the database.
  5. Finally release the updater; it cannot be reused.
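
The steps above can be sketched with a small in-memory model. This is not LiteCore code: MockLazyIndex, MockUpdater, embed and updateIndex are hypothetical stand-ins invented for illustration, and the real C4Index / C4IndexUpdater signatures may differ. The sketch only models the behavior described here: saved documents are not indexed until an update runs, deletions take effect immediately, and beginUpdate returns null when there is nothing to do.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Vector = std::vector<float>;

struct MockUpdater;

// Hypothetical in-memory model of a lazy index; not a LiteCore type.
struct MockLazyIndex {
    std::map<std::string, std::string> pending;  // docID -> expression value awaiting a vector
    std::map<std::string, Vector>      entries;  // docID -> vector, i.e. what a query would see

    // Saving a document only records it as pending; the index itself is untouched.
    void documentSaved(const std::string& docID, const std::string& value) { pending[docID] = value; }

    // Deleting a document removes its index entry right away.
    void documentDeleted(const std::string& docID) { pending.erase(docID); entries.erase(docID); }

    // Returns an updater for the pending documents, or nullptr if the index is up to date.
    std::unique_ptr<MockUpdater> beginUpdate();
};

// Hypothetical stand-in for the updater; single use, like the one described above.
struct MockUpdater {
    MockLazyIndex& index;
    std::vector<std::pair<std::string, std::string>> items;  // (docID, expression value)
    std::vector<Vector> vectors;                              // vectors[i] belongs to items[i]

    explicit MockUpdater(MockLazyIndex& idx)
        : index(idx), items(idx.pending.begin(), idx.pending.end()), vectors(items.size()) {}

    std::size_t count() const { return items.size(); }
    const std::string& valueAt(std::size_t i) const { return items.at(i).second; }
    void setVector(std::size_t i, Vector v) { vectors.at(i) = std::move(v); }

    // Writes the vectors into the index. Items that never got a vector stay pending
    // (a simplification chosen for this mock, not documented LiteCore behavior).
    void finish() {
        for (std::size_t i = 0; i < items.size(); ++i) {
            if (vectors[i].empty()) continue;
            index.entries[items[i].first] = std::move(vectors[i]);
            index.pending.erase(items[i].first);
        }
    }
};

inline std::unique_ptr<MockUpdater> MockLazyIndex::beginUpdate() {
    if (pending.empty()) return nullptr;
    return std::make_unique<MockUpdater>(*this);
}

// Hypothetical embedding function; a real app might call a model or a network service here.
inline Vector embed(const std::string& text) {
    return {static_cast<float>(text.size()), 1.0f};
}

// Steps 2-5 of the list above. Returns the number of values that were processed.
inline std::size_t updateIndex(MockLazyIndex& index) {
    auto updater = index.beginUpdate();          // step 2
    if (!updater) return 0;                      // index already up to date
    for (std::size_t i = 0; i < updater->count(); ++i)
        updater->setVector(i, embed(updater->valueAt(i)));   // step 3
    updater->finish();                           // step 4; in LiteCore this needs a transaction
    assert(index.pending.empty());               // every value received a vector in this sketch
    return updater->count();
}                                                // step 5: the updater is released on return
```

In this model, calling updateIndex right before a query (or from a change notification handler) is what makes newly saved documents visible; a second call with no intervening saves returns 0 because beginUpdate yields nullptr.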

snej avatar Feb 13 '24 01:02 snej

I've implemented skipping documents now.

snej avatar Mar 20 '24 16:03 snej

Huh, the Jenkins build failed on Android:

couchbase-lite-core/LiteCore/Query/LazyIndex.cc:133:15: error: call to member function 'bind' is ambiguous
         _ins->bind(1, rowid);

Looks like that issue with int64_t being long on Linux instead of long long ... but why didn't the GitHub CI Ubuntu build fail with the same error?
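
For reference, a minimal sketch of how this kind of ambiguity can arise. The overload set below is hypothetical and is not the actual declaration behind `_ins->bind`; it just assumes overloads for int, long long and double with none for long. On a platform where int64_t is long, a long argument can convert to any of the three with equal rank, so the call is ambiguous. Casting to a type that has an exact overload is one way around it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical overload set, for illustration only.
// Note that there is no overload taking `long`.
inline std::string bindKind(int /*column*/, int)       { return "int"; }
inline std::string bindKind(int /*column*/, long long) { return "long long"; }
inline std::string bindKind(int /*column*/, double)    { return "double"; }

inline std::string bindRowid(std::int64_t rowid) {
    // bindKind(1, rowid);
    // ^ Ambiguous where int64_t is `long`: long->int, long->long long and
    //   long->double are all conversions of the same rank, so none wins.
    //   Where int64_t is `long long`, the same line is an exact match and compiles.

    // Casting to a type with an exact overload resolves the call on every platform.
    return bindKind(1, static_cast<long long>(rowid));
}

// Small usage check.
inline void demoBindRowid() {
    assert(bindRowid(42) == "long long");
}
```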

snej avatar Mar 21 '24 16:03 snej