project-lakechain

Feature request: Vector Storage does not allow specifying document ID indexing logic

Open HQarroum opened this issue 1 year ago • 0 comments

Use case

Today, the vector storage connector uses the document URL, or the chunk identifier if the document is a chunk, as the document identifier it passes to OpenSearch when indexing the document. This is a problem for documents that change often, as it can lead to modified chunks being duplicated in the OpenSearch storage.

Solution/User Experience

Provide a way for end users to define how they want the vector storage connector to index documents (e.g., append-only, or removal of previously indexed chunks before insertion).
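
To make the two options concrete, here is a minimal sketch of what such a setting could look like. Everything in it is hypothetical and not part of the Lakechain API: the `IndexingStrategy` type, the `InMemoryIndex` class (a stand-in for an OpenSearch index), and the `makeChunks` helper. It also assumes that a chunk's identifier changes whenever its content changes, which is the situation that leaves stale chunks behind.

```typescript
// Hypothetical names throughout; none of these are part of the Lakechain API.
type IndexingStrategy = 'append-only' | 'replace-previous';

interface Chunk {
  id: string;     // identifier passed to the store as the document id
  source: string; // URL of the document the chunk was derived from
  text: string;
}

// In-memory stand-in for an OpenSearch index, keyed by document id.
class InMemoryIndex {
  private readonly docs = new Map<string, Chunk>();

  index(chunks: Chunk[], strategy: IndexingStrategy): void {
    if (strategy === 'replace-previous') {
      // Remove every chunk previously indexed for the same source documents.
      const sources = new Set(chunks.map((c) => c.source));
      Array.from(this.docs.entries()).forEach(([id, doc]) => {
        if (sources.has(doc.source)) {
          this.docs.delete(id);
        }
      });
    }
    // Insert (or overwrite, when the id already exists) the new chunks.
    chunks.forEach((chunk) => this.docs.set(chunk.id, chunk));
  }

  // Sorted texts of all chunks currently stored for a given source.
  texts(source: string): string[] {
    return Array.from(this.docs.values())
      .filter((doc) => doc.source === source)
      .map((doc) => doc.text)
      .sort();
  }
}

// ASSUMPTION: chunk ids are derived from chunk content, so an edited
// chunk receives a new id instead of overwriting the old one.
function makeChunks(source: string, texts: string[]): Chunk[] {
  return texts.map((text) => ({ id: `${source}#${text}`, source, text }));
}

const url = 'https://example.com/report.txt';
const v1 = makeChunks(url, ['intro v1', 'body']);
const v2 = makeChunks(url, ['intro v2', 'body']);

// Append-only: the stale 'intro v1' chunk stays next to 'intro v2'.
const appendOnly = new InMemoryIndex();
appendOnly.index(v1, 'append-only');
appendOnly.index(v2, 'append-only');

// Replace-previous: chunks of the earlier version are removed first.
const replacing = new InMemoryIndex();
replacing.index(v1, 'replace-previous');
replacing.index(v2, 'replace-previous');
```

Against a real OpenSearch index, the removal step would probably be a delete-by-query on a field holding the source document URL, which assumes the connector stores such a field alongside each chunk.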

Alternative solutions

No response

HQarroum avatar Feb 09 '24 12:02 HQarroum