Implement embedding filtering where appropriate

Open jmartisk opened this issue 1 year ago • 3 comments

LangChain4j 0.28 has a new metadata filter API for embedding stores. We should implement that for embedding stores where it is appropriate - for some embedding stores we don't just take the upstream implementation, we have our own instead.

Tracking:

  • [ ] Chroma
  • [ ] Infinispan
  • [x] Milvus - we use the upstream impl, and filtering is implemented there
  • [x] PgVector
  • [ ] Pinecone
  • [ ] Redis

jmartisk, Mar 12 '24 08:03

I can contribute the PgVector part. Any advice on where to start? Here is the example I'm thinking of starting from: https://github.com/langchain4j/langchain4j-examples/blob/main/rag-examples/src/main/java/_06_Metadata_Filtering.java#L100

humcqc, Mar 22 '24 13:03

You'll need to override the EmbeddingStore.search method in PgVectorEmbeddingStore so that it doesn't ignore the passed filter, but instead transforms it into a suitable SQL WHERE clause and uses that clause when retrieving embeddings from the database...
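To make the filter-to-WHERE idea concrete, here is a minimal, self-contained sketch. It does not use the real LangChain4j filter classes: the `Filter`, `Eq`, `And`, `Or` and `Not` types below are hypothetical stand-ins, and the JSON column name `metadata` is an assumption about the table layout, not something taken from PgVectorEmbeddingStore. Values are collected as bind parameters rather than concatenated into the SQL.

```java
import java.util.List;

public class FilterToSql {

    // Hypothetical stand-ins for a filter tree; NOT the LangChain4j types.
    public interface Filter {}
    public record Eq(String key, String value) implements Filter {}
    public record And(Filter left, Filter right) implements Filter {}
    public record Or(Filter left, Filter right) implements Filter {}
    public record Not(Filter inner) implements Filter {}

    /**
     * Recursively renders a filter as a SQL boolean expression.
     * Values are appended to params in the order their '?' placeholders appear.
     * Assumes metadata lives in a JSON/JSONB column named "metadata" (an assumption).
     */
    public static String toWhere(Filter filter, List<Object> params) {
        if (filter instanceof Eq e) {
            params.add(e.value());
            // ->> extracts the JSON field as text, so it compares against a string parameter
            return "metadata->>'" + safeKey(e.key()) + "' = ?";
        }
        if (filter instanceof And a) {
            return "(" + toWhere(a.left(), params) + " AND " + toWhere(a.right(), params) + ")";
        }
        if (filter instanceof Or o) {
            return "(" + toWhere(o.left(), params) + " OR " + toWhere(o.right(), params) + ")";
        }
        if (filter instanceof Not n) {
            return "NOT (" + toWhere(n.inner(), params) + ")";
        }
        throw new UnsupportedOperationException("Unsupported filter: " + filter);
    }

    // Keys are embedded in the SQL text, so only plain identifiers are accepted.
    private static String safeKey(String key) {
        if (!key.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unsupported metadata key: " + key);
        }
        return key;
    }
}
```

In an overridden search method, the returned string would be appended after `WHERE` in the similarity query and the collected `params` bound to a `PreparedStatement` in order. A real implementation would also need the other comparison operators (greater than, in, and so on) and type-aware casts for non-string metadata values.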

jmartisk, Mar 22 '24 13:03

First try -> https://github.com/quarkiverse/quarkus-langchain4j/pull/410

humcqc, Mar 26 '24 14:03