manifold
manifold copied to clipboard
Unrecognized type should be Object not String
Describe the bug
To Reproduce install pgvector for postgres https://github.com/pgvector/pgvector create table
create table if not exists document_embeddings
(
id serial8 primary key,
embedding vector(2) not null,
content text not null,
metadata jsonb not null,
created_at timestamp not null default now()
);
run code
My.DocumentEmbeddingsTest.builder("[0.2,0.3]", "abc", "{}").build();
or
My.addSqlChange(ctx -> {
"[.sql/] insert into document_embeddings_test(embedding, content, metadata) values (?,?,?::jsonb)".execute(
ctx, "[0.2,0.3]", "abc", "{}");
});
Exception in thread "main" org.postgresql.util.PSQLException: ERROR: column "embedding" is of type vector but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Expected behavior
There is a workaround like jsonb
, We pass String
and cast it to vector
My.addSqlChange(ctx -> {
"[.sql/] insert into document_embeddings_test(embedding, content, metadata) values (?::vector,?,?::jsonb)".execute(
ctx, "[0.2,0.3]", "abc", "{}");
});
or better without casting (not possible for now)
My.addSqlChange(ctx -> {
"[.sql/] insert into document_embeddings_test(embedding, content, metadata) values (?,?,?::jsonb)".execute(
ctx, new float[]{0.1F, 0.2F}, "abc", "{}");
});
use raw jdbc
pstmt.setObject(1, new float[]{0.1F, 0.2F});
// https://github.com/pgvector/pgvector-java
PGvector pGvector = new PGvector(new float[]{0.1F, 0.2F});
pstmt.setObject(1, pGvector);
The Problem Converting a String to a vector works for small vectors, but often we use vectors of size 1024 or larger. Thus, we need to convert a float [1024] to a String, pass it to the database, and then the database must convert the String back to a vector, which is time-consuming.
Manually maintaining mappings from non-JDBC types to Java types can be an endless task. It's better to allow users to implement these mappings. What I propose is that an unrecognized type should default to Object, not String, as converting a String to another type can be a performance issue. Additionally, there should be a way for users to map these types themselves.
Desktop (please complete the following information):
- OS Type & Version:
- Java/JDK version: 21
- IDE version (IntelliJ IDEA or Android Studio): IntelliJ IDEA
- Manifold version: 2024.1.16
- Manifold IntelliJ plugin version: 2024.1.4
Additional context
I understand that ValueAccessor
is for mapping, and I want to implement it myself.
https://github.com/manifold-systems/manifold/blob/e94f5ae8cdb87c9d901da2352a97a13a81c9f9d0/manifold-deps-parent/manifold-sql-rt/src/main/java/manifold/sql/rt/impl/accessors/OtherValueAccessor.java#L37-L40
it invoke getColumnClassName
to get java type, postgresql jdbc implement this method, is this the problem of postgresql jdbc?
https://github.com/pgjdbc/pgjdbc/blob/450488c142fdc368cab54e8257407603acc18c4f/pgjdbc/src/main/java/org/postgresql/jdbc/PgResultSetMetaData.java#L440
for some reason I can't step into getColumnClassName
while debugging.
Stack trace Please include a stack trace if applicable