Vector search support

Open penberg opened this issue 1 year ago • 2 comments

This pull request adds initial support for vector search in libSQL.

Highlights

  • Vector column type for storing vectors in tables.
  • Vector indexes that are updated automatically whenever the table changes.
  • Exact vector search with metadata filtering using plain SQL.
  • Approximate vector search using the new vector_top_k() function, which is backed by a DiskANN-based vector index.

Usage

Creating a table with a vector column:

CREATE TABLE movies (
  title TEXT, 
  year INT, 
  embedding FLOAT32(3)
);

Inserting vector data:

INSERT INTO movies (title, year, embedding) 
VALUES 
  (
    'Napoleon', 
    2023, 
    vector('[1,2,3]')
  ), 
  (
    'Black Hawk Down', 
    2001, 
    vector('[10,11,12]')
  ), 
  (
    'Gladiator', 
    2000, 
    vector('[7,8,9]')
  ), 
  (
    'Blade Runner', 
    1982, 
    vector('[4,5,6]')
  );

Creating an index on the vector column:

CREATE INDEX movies_idx USING vector_cosine_ops ON movies (embedding);

Finding top-k similar rows (exact):

SELECT title, year FROM movies ORDER BY vector_distance_cos(embedding, '[3,1,2]') LIMIT 3;
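
To make the exact query concrete, here is a minimal sketch that emulates it in stock SQLite from Python. It is not how libSQL implements the function: the embedding column is stored as plain TEXT, and vector_distance_cos is registered as a Python user-defined function that computes 1 minus the cosine similarity. Both choices are assumptions made for illustration only.

```python
import json
import math
import sqlite3


def vector_distance_cos(a_text, b_text):
    """Cosine distance (1 - cosine similarity) of two '[x,y,z]' style vectors."""
    a = json.loads(a_text)
    b = json.loads(b_text)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


conn = sqlite3.connect(":memory:")
# Assumption: a Python UDF stands in for libSQL's built-in function.
conn.create_function("vector_distance_cos", 2, vector_distance_cos)
# Assumption: TEXT replaces the FLOAT32(3) column type from the example above.
conn.execute("CREATE TABLE movies (title TEXT, year INT, embedding TEXT)")
conn.executemany(
    "INSERT INTO movies (title, year, embedding) VALUES (?, ?, ?)",
    [
        ("Napoleon", 2023, "[1,2,3]"),
        ("Black Hawk Down", 2001, "[10,11,12]"),
        ("Gladiator", 2000, "[7,8,9]"),
        ("Blade Runner", 1982, "[4,5,6]"),
    ],
)

# Same shape as the exact query: every row is scored, then the best 3 are kept.
rows = conn.execute(
    "SELECT title, year FROM movies "
    "ORDER BY vector_distance_cos(embedding, '[3,1,2]') LIMIT 3"
).fetchall()
print(rows)
# [('Black Hawk Down', 2001), ('Gladiator', 2000), ('Blade Runner', 1982)]
```

Because every row is scored, the cost grows linearly with the table size, which is the motivation for the approximate, index-backed variant.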

Finding top-k similar rows (approximate):

SELECT 
  title, 
  year 
FROM 
  vector_top_k('movies_idx', '[4,5,6]', 3) 
JOIN
  movies 
ON 
  movies.rowid = id;
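
The query above has two stages: vector_top_k() yields candidate ids, and the join maps them back to rows by rowid. The sketch below shows only that data flow in stock SQLite from Python. The helper top_k_rowids is hypothetical and is a brute-force stand-in, not DiskANN and not approximate; the temporary table top_k and the TEXT embedding column are likewise assumptions for illustration.

```python
import heapq
import json
import math
import sqlite3

MOVIES = [
    ("Napoleon", 2023, "[1,2,3]"),
    ("Black Hawk Down", 2001, "[10,11,12]"),
    ("Gladiator", 2000, "[7,8,9]"),
    ("Blade Runner", 1982, "[4,5,6]"),
]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def top_k_rowids(conn, query_text, k):
    """Hypothetical stand-in for vector_top_k(): scans every row, returns k rowids."""
    query = json.loads(query_text)
    scored = (
        (cosine_distance(json.loads(embedding), query), rowid)
        for rowid, embedding in conn.execute("SELECT rowid, embedding FROM movies")
    )
    return [rowid for _, rowid in heapq.nsmallest(k, scored)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INT, embedding TEXT)")
conn.executemany(
    "INSERT INTO movies (title, year, embedding) VALUES (?, ?, ?)", MOVIES
)

# Stage 1: materialize the candidate ids (the role vector_top_k() plays above).
ids = top_k_rowids(conn, "[4,5,6]", 3)
conn.execute("CREATE TEMP TABLE top_k (id INTEGER)")
conn.executemany("INSERT INTO top_k (id) VALUES (?)", [(i,) for i in ids])

# Stage 2: join the ids back to the base table by rowid, as in the query above.
rows = conn.execute(
    "SELECT title, year FROM top_k JOIN movies ON movies.rowid = id"
).fetchall()
print(rows)
```

The join has no ORDER BY, so the order of the returned rows should not be relied on in this sketch.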

Limitations

  • The index key is always rowid; primary keys are not supported.
  • CREATE INDEX does not index rows that already exist in the base table.
  • The vector index uses 32 bits per vector element, which causes redundant I/O and space amplification.
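
As a rough illustration of the last limitation, the sketch below counts only the raw element payload of a single vector. It says nothing about the actual on-disk layout of the index, which is not described here. The 1536-dimension vector and the 8-bit alternative are hypothetical values chosen for comparison.

```python
import struct


def raw_vector_bytes(dims, bits_per_element=32):
    """Bytes for the element payload of one vector, ignoring any index overhead."""
    return (dims * bits_per_element + 7) // 8


# A 3-element vector packed as little-endian 32-bit floats occupies 12 bytes.
packed = struct.pack("<3f", 1.0, 2.0, 3.0)
print(len(packed), raw_vector_bytes(3))  # 12 12

# Hypothetical 1536-dimension embedding: 32-bit elements vs. 8-bit elements.
print(raw_vector_bytes(1536))  # 6144
print(raw_vector_bytes(1536, bits_per_element=8))  # 1536
```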

penberg, May 20 '24 10:05

Any status update on this PR? We can hardly wait for vector support 🙏🏻 Thanks for the work, by the way!

pax-k, Jun 11 '24 13:06

@pax-k I am actively working with folks to iron out some bugs and then get this merged.

penberg, Jun 12 '24 10:06

This work has been merged as part of the following PRs:

https://github.com/tursodatabase/libsql/pull/1531

https://github.com/tursodatabase/libsql/pull/1551

https://github.com/tursodatabase/libsql/pull/1557

https://github.com/tursodatabase/libsql/pull/1560

Therefore, closing this.

penberg, Jul 24 '24 08:07