
Provide a way to unset (reset) the value of a vector field

Open Hronom opened this issue 3 years ago • 4 comments

We need a way to unset the vector for a document. Unfortunately, unlike other field types, we cannot set it to null, because we get this error:

                "error": {
                    "type": "mapper_parsing_exception",
                    "reason": "failed to parse field [my_vector2] of type [knn_vector] in document with id '10'. Preview of field's value: 'null'",
                    "caused_by": {
                        "type": "illegal_argument_exception",
                        "reason": "Vector dimension mismatch. Expected: 4, Given: 0"
                    }
                }

Hronom avatar Apr 15 '21 18:04 Hronom

Thanks for pointing this out. We will consider this as a feature request and prioritize.

vamshin avatar Apr 21 '21 23:04 vamshin

I came to this repo to ask for the same feature. lol

Just to add more context: I want to reduce disk usage and memory pressure by removing just the embedding field and setting the document's status to False.

Is there any other way to achieve this until the feature is implemented?

Thanks!

marcoaleixo avatar Apr 23 '21 00:04 marcoaleixo

Hi @marcoaleixo @Hronom, as @vamshin mentioned, we will work on allowing a null value to be set for the knn_vector field. In the meantime, have you tried removing the field from the document using the update API?

Example: Get document 4

curl "localhost:9200/myindex/_doc/4?pretty"
{
  "_index" : "myindex",
  "_type" : "_doc",
  "_id" : "4",
  "_version" : 3,
  "_seq_no" : 7,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "my_dense_vector" : [
      10,
      10
    ],
    "color" : "BLUE"
  }
}

Here, I am removing the my_dense_vector field from a single document using the update API.

curl -X POST "localhost:9200/myindex/_update/4?pretty" -H 'Content-Type: application/json' -d'
{
  "script" : "ctx._source.remove(\"my_dense_vector\")"
}'

Note: To remove the field from many documents at once, you can use update by query, with a query that first checks whether the field exists.
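
As a rough sketch of the update-by-query note above, the request could be built like this. The helper name build_remove_field_request is hypothetical, the index and field names are taken from the example in this comment, and it assumes the exists query matches documents that carry the vector field on your cluster, so please verify against a test index first.

```python
import json
import urllib.request


def build_remove_field_request(base_url, index, field):
    """Build a POST to _update_by_query that drops `field` from matching docs.

    Hypothetical helper for illustration; it only builds the request,
    it does not send it.
    """
    body = {
        # Painless script: remove the field from the document source.
        "script": {
            "lang": "painless",
            "source": "ctx._source.remove(params.field)",
            "params": {"field": field},
        },
        # Only touch documents where the field is present.
        "query": {"exists": {"field": field}},
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/_update_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed usage against a local cluster (not executed here):
# req = build_remove_field_request("http://localhost:9200", "myindex", "my_dense_vector")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Passing the field name through params keeps the script text constant, so the same script can be reused for other fields.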

When I get doc 4 after the removal, I no longer see the field value.

curl "localhost:9200/myindex/_doc/4?pretty"
{
  "_index" : "myindex",
  "_type" : "_doc",
  "_id" : "4",
  "_version" : 2,
  "_seq_no" : 6,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "color" : "BLUE"
  }
}

Please let us know if this helps until we enable setting a null value.

VijayanB avatar May 07 '21 20:05 VijayanB

Thanks, waiting for the fix.

Unfortunately, your proposal does not fit our workflow, since we do not use script-based updates. But I believe it will work for someone else.

Hronom avatar Jun 18 '21 14:06 Hronom