POC: pgvector support

Open alco opened this issue 1 year ago • 0 comments

Limitations of this POC:

  • The server translates the type vector(N) to TEXT(N) when building SQLite migrations from the Postgres schema.

  • The POC doesn't validate the dimensions of incoming values. If the PG type is vector(3) but the client sends a vector of a different dimension, the write to PG fails.

  • Vectors are only supported in the direct_writes mode.
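
The missing dimension check could be done on the client before a write is sent. The following is a minimal sketch and not part of this POC: `parseVectorDimension` and `assertDimension` are hypothetical helper names, and the sketch assumes the declared PG type string (e.g. `vector(3)`) is available on the client.

```typescript
// Hypothetical helpers, not part of the electric-sql client API.

// Extracts N from a Postgres type string such as "vector(3)".
// Returns null for types that are not a dimensioned vector.
function parseVectorDimension(pgType: string): number | null {
  const match = /^vector\((\d+)\)$/i.exec(pgType.trim())
  return match ? Number(match[1]) : null
}

// Throws if the value does not match the declared dimension, so the
// mismatch surfaces locally instead of as a failed write to PG.
function assertDimension(pgType: string, value: number[]): void {
  const expected = parseVectorDimension(pgType)
  if (expected !== null && value.length !== expected) {
    throw new Error(
      `expected ${expected} dimensions for ${pgType}, got ${value.length}`
    )
  }
}
```

With these helpers, `assertDimension("vector(3)", [1, 2])` would throw locally, while a type without a declared dimension passes through unchecked.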

To see a working example in action, navigate to examples/web-wa-sqlite, apply the migration and generate the client. You'll notice that the generated Zod schema for the issue table is missing the embeddings field. I think it's a consequence of the fact that Prisma does not support the vector type and generates the Unsupported("vector(768)") Prisma type for it. We can work around this by migrating off Prisma introspection and generating the Prisma schema ourselves, as shown in https://github.com/electric-sql/electric/pull/872.

Nevertheless, it allows you to create new issues on the client (those will have embeddings set to NULL) and on the server. In the latter case, embeddings get synced to the client database.
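
Since the column is mapped to a text type on SQLite, an application reading synced embeddings would need to decode them itself. The sketch below assumes the value arrives in pgvector's text form (e.g. `[0.1,-0.5,0.3]`) or as NULL for issues created on the client. That assumption is not confirmed by this POC, and `parseVector` and `serializeVector` are hypothetical names.

```typescript
// Hypothetical helpers; assumes the embeddings column holds pgvector's
// text representation, e.g. "[0.1,-0.5,0.3]", or null when unset.

// Decodes a vector literal into an array of numbers.
function parseVector(text: string | null): number[] | null {
  if (text === null) return null
  const trimmed = text.trim()
  if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
    throw new Error(`not a vector literal: ${text}`)
  }
  const inner = trimmed.slice(1, -1).trim()
  if (inner === "") return []
  return inner.split(",").map((part) => {
    const n = Number(part)
    // Reject empty or non-numeric components such as "[1,,2]" or "[a]".
    if (part.trim() === "" || !Number.isFinite(n)) {
      throw new Error(`invalid vector component: "${part}"`)
    }
    return n
  })
}

// Encodes an array of numbers back into the same text form.
function serializeVector(value: number[]): string {
  return `[${value.join(",")}]`
}
```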

In the screenshot below you can see the outcome of the following sequence of actions:

  • Create an issue in the web app by clicking on the Add button.
  • Insert a new row directly into Postgres:
    # select * from issue;
    ─[ RECORD 1 ]───────────────────────────────────────────────────────────────────
    id          │ b8a5741b-2192-4ab3-a01c-f9152cf71122
    title       │ foo title
    description │ ...
    priority    │ 1
    status      │ (no status)
    modified    │ Wed Feb 07 2024 17:44:18 GMT+0200 (Eastern European Standard Time)
    created     │ Wed Feb 07 2024 17:44:18 GMT+0200 (Eastern European Standard Time)
    kanbanorder │ 4
    username    │ ddab6602-25fb-4815-acdc-66dc3d28c44f
    embeddings  │ ∅

    # insert into issue values (
      gen_random_uuid(),
      'bar title',
      '...',
      '2',
      '(no status)',
      now(),
      now(),
      '2',
      'placeholder username',
      (select
         array_agg(random()::real * (1 - -1) + -1)
       from
         generate_series (1, 768)
      )::vector(768)
    );
    INSERT 0 1
    

Screenshot from 2024-02-07 17-50-04

The SatRelation messages exchanged between the client and the server currently look as follows:

[proto] send: #SatRelation{for: public.issue, as: 0, cols: [id: TEXT, title: TEXT, description: TEXT, priority: TEXT, status: TEXT, modified: TEXT, created: TEXT, kanbanorder: TEXT, username: TEXT, embeddings: TEXT(768)]} [client.ts:993:33](http://localhost:5173/node_modules/electric-sql/src/satellite/client.ts)

[proto] send: #SatOpLog{ops: [#Begin{lsn: AAAAAQ==, ts: 1707320659007, isMigration: false}, #Insert{for: 0, tags: [], new: ["b8a5741b-2192-4ab3-a01c-f9152cf71122", "foo title", "...", "1", "(no status)", "Wed Feb 07 2024 17:44:18 GMT+0200 (Eastern European Standard Time)", "Wed Feb 07 2024 17:44:18 GMT+0200 (Eastern European Standard Time)", "4", "ddab6602-25fb-4815-acdc-66dc3d28c44f", ∅]}, #Commit{lsn: }]} [client.ts:993:33](http://localhost:5173/node_modules/electric-sql/src/satellite/client.ts)

[proto] recv: #SatRelation{for: public.issue, as: 16891, cols: [id: text PK, title: text, description: text, priority: text, status: text, modified: text, created: text, kanbanorder: text, username: text, embeddings: vector]} [client.ts:852:12](http://localhost:5173/node_modules/electric-sql/src/satellite/client.ts)

[proto] recv: #SatOpLog{ops: [#Begin{lsn: MjY3ODAxMzY=, ts: 1707320659007, isMigration: false}, #Insert{for: 16891, tags: [ff5e935c-a73f-4b97-bc2d-1f2b07b87bcb@1707320659007], new: ["b8a5741b-2192-4ab3-a01c-f9152cf71122", "foo title", "...", "1", "(no status)", "Wed Feb 07 2024 17:44:18 GMT+0200 (Eastern European Standard Time)", "Wed Feb 07 2024 17:44:18 GMT+0200 (Eastern European Standard Time)", "4", "ddab6602-25fb-4815-acdc-66dc3d28c44f", ∅]}, #Commit{lsn: MjY3ODAxMzY=}]}

alco • Feb 07 '24 15:02