langchainjs
[Milvus] Store array and JSON metadata fields directly
Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain.js documentation with the integrated search.
- [X] I used the GitHub search to find a similar question and didn't find it.
- [X] I am sure that this is a bug in LangChain.js rather than my code.
- [X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
// Based on https://js.langchain.com/docs/integrations/vectorstores/milvus/
import { Milvus } from "@langchain/community/vectorstores/milvus";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "langchain/document";
const docs: Document[] = [
  new Document({
    pageContent: "This is a test document.",
    metadata: {
      source: "test.txt",
      foo: {
        bar: "baz",
      },
      qux: [1, 2, 3],
    },
  }),
];
const vectorStore = await Milvus.fromDocuments(docs, new OpenAIEmbeddings(), {
  collectionName: "foobar",
});
// This filter won't work, because qux is stored as a VarChar:
//   "array_contains(qux, 1)"
// Only a string match like this will:
const response = await vectorStore.similaritySearch("test", 2, "qux like '%1%'");
Error Message and Stack Trace (if applicable)
No response
Description
Milvus 2.2.9 and 2.3.2, released in June 2023 and October 2023, added support for JSON and array data types respectively. This enables access to more efficient operators such as json_contains and array_contains. However, LangChain's current implementation uses VarChar for all metadata fields:
https://github.com/langchain-ai/langchainjs/blob/45498632ce2f5d539d84d049bf5b6717f674ac46/libs/langchain-community/src/vectorstores/milvus.ts#L772-L784
Is it possible to offer it as an option to the user, or do some magic version detection through MilvusClient.getVersion?
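To make the version-detection idea concrete, here is a rough sketch of how a field type could be picked per metadata value. Everything in it is an assumption for illustration: parseVersion, isAtLeast, isHomogeneousPrimitiveArray and inferFieldKind are hypothetical helpers, not part of LangChain.js or the Milvus SDK, and the version string is assumed to be whatever MilvusClient.getVersion reports (e.g. "v2.3.2"). The thresholds are the 2.2.9 (JSON) and 2.3.2 (array) releases mentioned above, and VarChar stays as the fallback.

```typescript
// Hypothetical helpers for illustration only; none of these names exist in
// LangChain.js or the Milvus SDK.
type MilvusFieldKind = "VarChar" | "JSON" | "Array";

// Extract major/minor/patch from strings such as "v2.3.2" or "2.3.2-dev".
function parseVersion(version: string): [number, number, number] {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Milvus version: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when `version` is the same as or newer than `minimum`.
function isAtLeast(version: string, minimum: string): boolean {
  const [aMajor, aMinor, aPatch] = parseVersion(version);
  const [bMajor, bMinor, bPatch] = parseVersion(minimum);
  if (aMajor !== bMajor) return aMajor > bMajor;
  if (aMinor !== bMinor) return aMinor > bMinor;
  return aPatch >= bPatch;
}

// Array fields need a single element type, so mixed or empty arrays
// are not treated as array candidates here.
function isHomogeneousPrimitiveArray(value: unknown[]): boolean {
  if (value.length === 0) return false;
  const kind = typeof value[0];
  if (kind !== "string" && kind !== "number" && kind !== "boolean") {
    return false;
  }
  return value.every((item) => typeof item === kind);
}

// Pick a field kind for one metadata value, falling back to VarChar
// whenever the server is too old or the value is a plain scalar.
function inferFieldKind(
  value: unknown,
  serverVersion: string
): MilvusFieldKind {
  if (
    Array.isArray(value) &&
    isHomogeneousPrimitiveArray(value) &&
    isAtLeast(serverVersion, "2.3.2")
  ) {
    return "Array";
  }
  if (
    typeof value === "object" &&
    value !== null &&
    isAtLeast(serverVersion, "2.2.9")
  ) {
    return "JSON";
  }
  return "VarChar";
}
```

With the metadata from the example, qux would map to an array field on 2.3.2 or newer and foo to a JSON field on 2.2.9 or newer. An explicit user option could simply override the result of inferFieldKind per field.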
System Info
Not sure how pnpm info langchain would be useful since it always shows the latest version, but my installed versions are:
@langchain/community 0.2.33
langchain 0.2.20
Windows, node v23.4.0, pnpm v9.15.1
Hi, @rakuzen25. I'm Dosu, and I'm helping the LangChain JS team manage their backlog. I'm marking this issue as stale.
Issue Summary:
- LangChain.js currently uses VarChar for all metadata fields.
- Milvus versions 2.2.9 and 2.3.2 support JSON and array data types.
- You suggested updating LangChain to utilize these data types for more efficient operations like json_contains and array_contains.
- There have been no comments or activity on this issue yet.
Next Steps:
- Is this issue still relevant to the latest version of the LangChain JS repository? If so, please comment to keep the discussion open.
- If there is no further activity, this issue will be automatically closed in 7 days.
Thank you for your understanding and contribution!
Still relevant
@jacoblee93, the user @rakuzen25 has indicated that this issue is still relevant. Could you please assist them with the update to utilize JSON and array data types in LangChain.js?