drizzle-orm
drizzle-orm copied to clipboard
[FEATURE]: add PgVector support
Support for PGVector, a popular extension for vectors and vector comparisons
AI projects are quite popular right now and often involve working with vector databases. The PgVector extension (https://github.com/pgvector/pgvector) has pg
, sequelize
, and prisma
support, but currently none for drizzle-orm
.
I spent a while trying to apply the custom types example to add PgVector support, but haven't been able to get it working yet. If anyone with more familiarity is able to take a look at it, it would unblock a use case for me, and I suspect many others!
Thank you to the team for your work on drizzle-orm :)
Update: progress! The following type definition works, at least for CRUD operations.
I don't use drizzle-kit so have not tested it with table generation.
Posting here for comments and improvements, and in case useful to others. Happy to submit the type if it would be useful to folks.
import { customType } from "drizzle-orm/pg-core";
export const vector = customType<{
data: number[];
driverData: string;
config: { size: number };
}>({
dataType(config) {
const dt =
!!config && typeof config.size === "number"
? `vector(${config.size})`
: "vector";
return dt;
},
fromDriver(value) {
return JSON.parse(value);
},
toDriver(value) {
return JSON.stringify(value);
},
});
Thanks @ianjdarrow! I added experimental support to pgvector-node (but it's not released yet). I tried using both customType as well as PgColumn/PgColumnBuilder, but wasn't able to get either working with Drizzle Kit. If someone is able to figure it out, please submit a PR.
What is the status on pgvector in drizzle on this day?
Any updates? Thanks
Generated columns should have type vector(dimensions)
, not "vector(dimensions)"
I think it need to fix the drizzle-kit but I cannot access the source :/
So I've removed the double quotes from the vector(1536)
in my migration.sql file, but it still doesn't want to push it to the database. I'm getting the error: type "vector(1536)" does not exist
error. (using drizzle-kit)
I was able to push it when changing it from vector(1536)
to vector
.
Is it a bad idea to store a vector as a vector
instead of as a vector(1536)
?
Until drizzle-kit properly supports non-native pg types, you can patch drizzle-kit yourself to add e.g. vector support.
This assumes you are using pnpm
to manage your dependencies and also that you're running the latest drizzle-kit version.
Path: ./patches/[email protected]
diff --git a/bin.cjs b/bin.cjs
index 371b26746840f9b5da1b63f763dd96cc569e7713..a9c498a6ab9e0e449c4a871a1788d138aaa24f87 100755
--- a/bin.cjs
+++ b/bin.cjs
@@ -15346,7 +15346,7 @@ var init_sqlgenerator = __esm({
if (pgNativeTypes.has(it))
return true;
const toCheck = it.replace(/ /g, "");
- return toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it);
+ return toCheck.startsWith("vector(") || toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it);
};
Convertor = class {
};
diff --git a/utils.js b/utils.js
index 3322ed988c7752ddc3c46f051db77960b5903bf5..f8c9ba642383b7e469446801509a88b78f2ef613 100644
--- a/utils.js
+++ b/utils.js
@@ -15063,7 +15063,7 @@ var init_sqlgenerator = __esm({
if (pgNativeTypes.has(it))
return true;
const toCheck = it.replace(/ /g, "");
- return toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it);
+ return toCheck.startsWith("vector(") || toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it);
};
Convertor = class {
};
Then add this to you package.json
"pnpm": {
"patchedDependencies": {
"[email protected]": "patches/[email protected]"
}
}
You can use pnpm patch if you need to fix this for any other version of drizzle-kit.
Until drizzle-kit properly supports non-native pg types, you can patch drizzle-kit yourself to add e.g. vector support.
This assumes you are using
pnpm
to manage your dependencies and also that you're running the latest drizzle-kit version.Path:
./patches/[email protected]
diff --git a/bin.cjs b/bin.cjs index 371b26746840f9b5da1b63f763dd96cc569e7713..a9c498a6ab9e0e449c4a871a1788d138aaa24f87 100755 --- a/bin.cjs +++ b/bin.cjs @@ -15346,7 +15346,7 @@ var init_sqlgenerator = __esm({ if (pgNativeTypes.has(it)) return true; const toCheck = it.replace(/ /g, ""); - return toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it); + return toCheck.startsWith("vector(") || toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it); }; Convertor = class { }; diff --git a/utils.js b/utils.js index 3322ed988c7752ddc3c46f051db77960b5903bf5..f8c9ba642383b7e469446801509a88b78f2ef613 100644 --- a/utils.js +++ b/utils.js @@ -15063,7 +15063,7 @@ var init_sqlgenerator = __esm({ if (pgNativeTypes.has(it)) return true; const toCheck = it.replace(/ /g, ""); - return toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it); + return toCheck.startsWith("vector(") || toCheck.startsWith("varchar(") || toCheck.startsWith("char(") || toCheck.startsWith("numeric(") || toCheck.startsWith("timestamp(") || toCheck.startsWith("intervalyear(") || toCheck.startsWith("intervalmonth(") || toCheck.startsWith("intervalday(") || toCheck.startsWith("intervalhour(") || toCheck.startsWith("intervalminute(") || toCheck.startsWith("intervalsecond(") || toCheck.startsWith("intervalyeartomonth(") || toCheck.startsWith("intervaldaytohour(") || toCheck.startsWith("intervaldaytominute(") || toCheck.startsWith("intervaldaytosecond(") || toCheck.startsWith("intervalhourtominute(") || toCheck.startsWith("intervalhourtosecond(") || toCheck.startsWith("intervalminutetosecond(") || /^(\w+)(\[\d*])+$/.test(it); }; Convertor = class { };
Then add this to you
package.json
"pnpm": { "patchedDependencies": { "[email protected]": "patches/[email protected]" } }
You can use pnpm patch if you need to fix this for any other version of drizzle-kit.
Possible to expand a bit how to apply this for those of us who are not quite as familiar with patch?
Damn, back to Primsa ;)
I'm not seeing nor hearing about any pgvector updates from Drizzle. So yes, Prisma might win this one (for now).
Hi Drizzle ORM Team, any updates one vector support?
fwiw they are working on it, Andrew posted a screenshot on twitter that included showed a pgvector index in drizzle, as well as other vector stuff - so it seems like it's a work-in-progress.