langchainjs
feat: Added filtering ability to supabase
Added the ability to filter Supabase vector search results. With the following function defined in your Supabase Postgres database:
CREATE FUNCTION match_documents_with_filters (
  query_embedding vector(1536),
  match_count int,
  filter jsonb DEFAULT '{}'
) RETURNS TABLE (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
LANGUAGE plpgsql
AS $$
#variable_conflict use_column
BEGIN
  RETURN QUERY
  SELECT
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) AS similarity
  FROM
    documents
  WHERE
    -- Check for user_id filter
    jsonb_exists(metadata, 'user_id') AND metadata->>'user_id' = filter->>'user_id'
    -- Add additional filters here using the same pattern
  ORDER BY
    documents.embedding <=> query_embedding
  LIMIT
    match_count;
END;
$$;
You should be able to filter the documents based on fields in your document metadata:
export const query = async (query) => {
  const chat = new ChatOpenAI({
    modelName: "gpt-3.5-turbo",
    apiKey: OPEN_AI_API_KEY,
  });
  const vectorStore = await SupabaseVectorStore.fromExistingIndex(
    new OpenAIEmbeddings(),
    {
      client,
      tableName: "documents",
      queryName: "match_documents_with_filters",
    }
  );
  const chain = ConversationalRetrievalQAChain.fromLLM(
    chat,
    vectorStore.asRetriever(null, { user_id: "2" }),
    { returnSourceDocuments: true }
  );
  const res = await chain.call({
    question: query,
    chat_history: [],
  });
  console.log(res.text);
  console.log({ docs: res.sourceDocuments });
  return res;
};
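For readers wondering what reaches the database: the retriever call above presumably ends in an RPC to the function named by queryName, with the filter sent as a JSON argument. The sketch below is not the code from this PR. RpcLike, buildMatchArgs, and matchDocuments are hypothetical names, the argument names simply mirror the SQL function's parameters, and RpcLike is only a minimal stand-in for the rpc method of a supabase-js client.

```typescript
// Hypothetical minimal shape of a client exposing rpc(), e.g. a supabase-js client.
interface RpcLike {
  rpc(
    fn: string,
    args: Record<string, unknown>
  ): PromiseLike<{ data: unknown; error: { message: string } | null }>;
}

type MetadataFilter = Record<string, unknown>;

// Build the named arguments expected by match_documents_with_filters.
function buildMatchArgs(
  queryEmbedding: number[],
  k: number,
  filter: MetadataFilter = {}
): Record<string, unknown> {
  return {
    query_embedding: queryEmbedding,
    match_count: k,
    filter, // serialized to jsonb on the Postgres side
  };
}

// Call the Postgres function and surface any error returned by the RPC.
async function matchDocuments(
  client: RpcLike,
  queryEmbedding: number[],
  k: number,
  filter?: MetadataFilter
): Promise<unknown> {
  const { data, error } = await client.rpc(
    "match_documents_with_filters",
    buildMatchArgs(queryEmbedding, k, filter)
  );
  if (error) {
    throw new Error(`match_documents_with_filters failed: ${error.message}`);
  }
  return data;
}
```

Keeping the argument builder separate from the network call makes the filter shape easy to check without a database.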
@mishkinf @nfcampos Here's an alternate implementation, let me know what you think?
In addVectors, we accept additional fields to write to the table, so we can take advantage of Supabase db schema constraints. In similaritySearchVectorWithScore, these extra fields can be passed in to the rpc call. The user defines the rpc function with the additional fields in its args and uses them accordingly in the pgsql function.
If they want completely different querying methods, they can define different matcher functions with different names, each taking its fields as args. This way they get the benefits of Supabase's structured schemas and the flexibility to implement lookup efficiently per match function.
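One way to picture this alternative: each matcher function takes its extra fields as named arguments, and the caller picks the function by name. This is a rough sketch with hypothetical names throughout (MatcherCall, buildMatcherCall, match_documents_by_user, user_id). None of them come from either PR, and the user would still have to define the corresponding pgsql function.

```typescript
// Hypothetical description of one RPC call to a user-defined matcher function.
interface MatcherCall {
  fn: string;
  args: Record<string, unknown>;
}

// Pass the extra fields as top-level named arguments instead of
// wrapping them in a single jsonb filter.
function buildMatcherCall(
  fn: string,
  queryEmbedding: number[],
  k: number,
  extraFields: Record<string, unknown> = {}
): MatcherCall {
  return {
    fn,
    // Extra fields are spread first so they cannot overwrite the two required arguments.
    args: { ...extraFields, query_embedding: queryEmbedding, match_count: k },
  };
}

// Example: a matcher that filters on a dedicated user_id column (assumed to exist).
const exampleCall = buildMatcherCall("match_documents_by_user", [0.1, 0.2], 5, {
  user_id: 2,
});
console.log(exampleCall.fn, Object.keys(exampleCall.args));
```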
@ShantanuNair no strong opinions on my end, I merely need the functionality. One thing to note is that you could start with this implementation, which essentially creates an RPC-friendly way to pass filters to a Supabase function, and later build in support for creating new Supabase database columns (with whatever schema constraints and functionality). The filter arguments to the pgsql function are agnostic of the underlying data you are filtering the documents by. So in this case I am filtering on data that is in metadata, but you could take this approach and use the same interface to filter the documents based on whatever db fields you have.
@mishkinf Makes sense. I will soon create a PR with the functionality I suggested as well; you can take a look if you have the time.
I think that would be great! @ShantanuNair
Looking forward to this!
The filter param could also be generalized to accept any metadata field by using WHERE metadata @> filter, as follows:
CREATE FUNCTION match_documents_with_filters (
  query_embedding vector(1536),
  match_count int,
  filter jsonb DEFAULT '{}'
) RETURNS TABLE (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
LANGUAGE plpgsql
AS $$
#variable_conflict use_column
BEGIN
  RETURN QUERY
  SELECT
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) AS similarity
  FROM
    documents
  WHERE
    metadata @> filter
  ORDER BY
    documents.embedding <=> query_embedding
  LIMIT
    match_count;
END;
$$;
with usage:
...
const chain = ConversationalRetrievalQAChain.fromLLM(
  chat,
  vectorStore.asRetriever(null, {
    user_id: "2",
    repo: "langchain" // or any metadata field
  }),
  { returnSourceDocuments: true }
);
...
etc.
Hope this helps!
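To make the containment semantics concrete, here is a small in-memory approximation of what metadata @> filter would match. It is only a sketch: contains, Json, and the sample documents are made up for illustration, and it skips some Postgres edge cases (for example, an array containing a bare scalar). Note that under @> the default empty filter '{}' matches every row, whereas in the user_id-specific version the comparison against a missing filter->>'user_id' yields NULL and so matches none.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isObject(v: Json): v is { [key: string]: Json } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Approximation of jsonb containment (left @> right).
function contains(left: Json, right: Json): boolean {
  if (isObject(right)) {
    if (!isObject(left)) return false;
    // Every key/value pair on the right must be contained on the left.
    for (const key of Object.keys(right)) {
      const l = left[key];
      const r = right[key];
      if (l === undefined || r === undefined || !contains(l, r)) return false;
    }
    return true;
  }
  if (Array.isArray(right)) {
    if (!Array.isArray(left)) return false;
    // Every right element must be contained in some left element.
    for (const r of right) {
      let found = false;
      for (const l of left) {
        if (contains(l, r)) {
          found = true;
          break;
        }
      }
      if (!found) return false;
    }
    return true;
  }
  // Scalars must be strictly equal, so "2" and 2 do not match.
  return left === right;
}

// Made-up sample rows for illustration.
const sampleDocs: { content: string; metadata: Json }[] = [
  { content: "a", metadata: { user_id: "2", repo: "langchain" } },
  { content: "b", metadata: { user_id: "2", repo: "other" } },
  { content: "c", metadata: { user_id: "3", repo: "langchain" } },
];
const sampleFilter: Json = { user_id: "2", repo: "langchain" };
const matched = sampleDocs.filter((d) => contains(d.metadata, sampleFilter));
console.log(matched.map((d) => d.content)); // only "a" matches both fields
```

Because scalars are compared by type as well as value, a user_id stored as the string "2" will not match a filter of { user_id: 2 }.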
Honestly this is so fricken cool! I hope this happens!
Looking forward to this filtering feature in Supabase!
@mishkinf The PR looks good to me!
Let's do this! Thanks for the PR @mishkinf
yes! need this feature!