langchain4j
[FEATURE] Add advanced RAG with Azure AI Search
Now that I have added Azure AI Search vector search support in #530, and advanced RAG was merged in #538, my goal is to add hybrid search and semantic reranking to LangChain4j.
In the JavaScript version at https://github.com/langchain-ai/langchainjs/blob/main/libs/langchain-community/src/vectorstores/azure_aisearch.ts they basically have the 3 search modes in 3 functions next to each other:
- `similaritySearchVectorWithScore` is the current vector search we already have
- `hybridSearchVectorWithScore` is vector search + text search
- `semanticHybridSearchVectorWithScore` is hybrid search + reranking
Question 1: in the last 2 cases, the parallel searches and then the reranking are done inside Azure AI Search. How should this be implemented in LangChain4j? As far as I understand, this would bypass the current "advanced RAG" model.
Question 2: is it OK to implement this like in JavaScript? What I don't like is that the 2 new search functions won't be part of the `EmbeddingStore` interface. Or should I inherit from `AzureAiSearchEmbeddingStore` to create 2 new implementations?
Update: I'm currently doing an implementation similar to the JavaScript one, where you have an `AzureAISearchQueryType` parameter in the `AzureAiSearchEmbeddingStore` that lets you choose which of the 3 search types you want to use.
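To make the idea concrete, here is a minimal sketch of a store configured with a query type. Everything in it is hypothetical: `QueryTypeSketch` stands in for `AzureAISearchQueryType` (the constant names are my assumption), `QueryTypeStoreSketch` is not the real `AzureAiSearchEmbeddingStore`, and nothing here calls Azure AI Search. It only shows how one parameter could select which steps the service is asked to run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the AzureAISearchQueryType parameter; constant names are assumptions.
enum QueryTypeSketch {
    VECTOR,          // vector search only
    HYBRID,          // vector search + text search
    SEMANTIC_HYBRID  // hybrid search + reranking
}

// Hypothetical store, not the real AzureAiSearchEmbeddingStore.
final class QueryTypeStoreSketch {
    private final QueryTypeSketch queryType;

    QueryTypeStoreSketch(QueryTypeSketch queryType) {
        this.queryType = queryType;
    }

    // Lists the steps that the configured search type would ask the service to perform.
    List<String> searchSteps() {
        List<String> steps = new ArrayList<>();
        steps.add("vector search");
        if (queryType == QueryTypeSketch.HYBRID || queryType == QueryTypeSketch.SEMANTIC_HYBRID) {
            steps.add("text search");
        }
        if (queryType == QueryTypeSketch.SEMANTIC_HYBRID) {
            steps.add("reranking");
        }
        return steps;
    }
}
```

With this shape, the search type is fixed when the store is built, so callers keep using a single search entry point instead of 3 separate functions.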
Update 2: I totally missed that for hybrid search you also need the content! Otherwise it wouldn’t be hybrid 🤣 So this can’t work with the current interface. I’ll have an (imperfect) first draft PR today.
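A rough sketch of the problem, reading "content" as the text of the query (that reading, and every name below, is an assumption for illustration and not LangChain4j API): a search call that only receives a vector has nothing to run the text search on, so a hybrid request has to carry both.

```java
import java.util.Objects;

// Hypothetical request object, not part of LangChain4j.
final class SearchRequestSketch {
    final float[] queryVector;
    final String queryText; // null for a pure vector search

    private SearchRequestSketch(float[] queryVector, String queryText) {
        this.queryVector = queryVector;
        this.queryText = queryText;
    }

    // What a vector-only search method can express: just the embedding.
    static SearchRequestSketch vectorOnly(float[] queryVector) {
        return new SearchRequestSketch(Objects.requireNonNull(queryVector), null);
    }

    // Hybrid search needs the text as well, otherwise there is no text search to combine.
    static SearchRequestSketch hybrid(float[] queryVector, String queryText) {
        Objects.requireNonNull(queryVector);
        Objects.requireNonNull(queryText);
        if (queryText.trim().isEmpty()) {
            throw new IllegalArgumentException("hybrid search needs non-empty query text");
        }
        return new SearchRequestSketch(queryVector, queryText);
    }

    // True only when the request carries text for the text-search half.
    boolean supportsTextSearch() {
        return queryText != null;
    }
}
```

A method that takes only the embedding can build the first kind of request but never the second, which is why the extra input has to come from somewhere outside that signature.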