laravel-scout-typesense-driver

Model update resets document auto-generated embeddings

Open ingria opened this issue 2 years ago • 2 comments

Description

Typesense 0.25 makes it possible to define an auto-generated embedding field. Because this field is generated by Typesense and is not stored locally, any update to the model causes the embedding field to reset.

Steps to reproduce

Schema:

    {
      "name": "embedding",
      "type": "float[]",
      "facet": false,
      "optional": false,
      "index": true,
      "sort": false,
      "infix": false,
      "locale": "",
      "embed": {
        "from": [
          "some_field"
        ],
        "model_config": {
          "model_name": "ts/paraphrase-multilingual-mpnet-base-v2"
        }
      },
      "num_dim": 768
    }

After I import the model, Typesense takes some time to generate embeddings. Once that process finishes, every document has the embedding field populated with an array of 768 floats.

Then, if I call the searchable() method on the model, the embedding field becomes empty.
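
To confirm the symptom on a given document, one option is to fetch it from Typesense before and after calling searchable() and compare the state of the embedding field. The following is a minimal sketch only: the helper name embedding_state and the two sample documents are hypothetical, and the 768 default simply mirrors num_dim from the schema above. It treats both an absent field and an empty array as "missing", since the report does not say which of the two occurs.

```python
from typing import Any, Mapping


def embedding_state(doc: Mapping[str, Any],
                    field: str = "embedding",
                    num_dim: int = 768) -> str:
    """Classify the embedding field of a single document.

    Returns "missing" if the field is absent, not an array, or empty,
    "wrong_size" if its length differs from num_dim, and "ok" otherwise.
    """
    value = doc.get(field)
    if not isinstance(value, (list, tuple)) or len(value) == 0:
        return "missing"
    if len(value) != num_dim:
        return "wrong_size"
    return "ok"


# Hypothetical documents standing in for what a document lookup
# would return before and after the model update.
before = {"id": "1", "some_field": "hello", "embedding": [0.0] * 768}
after = {"id": "1", "some_field": "hello", "embedding": []}

print(embedding_state(before))
print(embedding_state(after))
```

Running the same check over a sample of documents after each step (import, wait for embeddings, searchable()) would show at which step the field is lost.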

Expected Behavior

The embedding field should either be regenerated when the embed.from fields change, or be left unchanged.

Actual Behavior

The embedding field becomes empty.

Metadata

Typesense Version: 0.25.0

OS: Ubuntu 20.04

ingria, Aug 17 '23 22:08

Also, subsequent calls of the artisan scout:import command delete the embedding field on all of the models in the Typesense collection.

ingria, Aug 17 '23 22:08