connector-x
PostgreSQL pgvector support
Describe your feature request
Please support the pgvector extension's vector type for PostgreSQL so that we can keep using ConnectorX without resorting to another library.
Recently, a customer tried to read pgvector embedding columns from a PostgreSQL database (hosted on Supabase) and encountered the following error:
2025-05-20 11:09:15 PDT.883 [INFO] Returning event stream for 1 tables
2025-05-20 11:09:15 PDT.884 [INFO] 127.0.0.1:58524 - "POST /v1/database/sample-tables HTTP/1.1" 200
2025-05-20 11:09:15 PDT.885 [INFO] Processing table: public.episodic_memories
2025-05-20 11:09:15 PDT.885 [INFO] Using sampling SQL: SELECT * FROM public.episodic_memories ORDER BY RANDOM() LIMIT 50
2025-05-20 11:09:15 PDT.885 [INFO] About to execute SQL for table public.episodic_memories using sample query: SELECT * FROM public.episodic_memories ORDER BY RANDOM() LIMIT 50
2025-05-20 11:09:15 PDT.885 [INFO] Testing sampling query execution directly
thread '<unnamed>' panicked at /Users/runner/work/connector-x/connector-x/connectorx/src/sources/postgres/typesystem.rs:109:22:
not implemented: vector
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2025-05-20 11:09:16 PDT.913 [ERROR] Exception in ASGI application
+ Exception Group Traceback (most recent call last):
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
| result = await app( # type: ignore[func-returns-value]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
| return await self.app(scope, receive, send)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
| await super().__call__(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/applications.py", line 113, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
| await self.app(scope, receive, _send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/middleware/cors.py", line 85, in __call__
| await self.app(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
| await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
| await route.handle(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
| await self.app(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
| await wrap_app_handling_exceptions(app, request)(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/starlette/routing.py", line 74, in app
| await response(scope, receive, send)
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/sse_starlette/sse.py", line 237, in __call__
| async with anyio.create_task_group() as task_group:
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 772, in __aexit__
| raise BaseExceptionGroup(
| BaseExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/sse_starlette/sse.py", line 240, in cancel_on_finish
| await coro()
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/sse_starlette/sse.py", line 159, in _stream_response
| async for data in self.body_iterator:
| File "/Users/kocienda/Mounts/nf/repo/dev/infactory_api/routes/route_helpers.py", line 168, in stream
| async for chunk in fn:
| File "/Users/kocienda/Mounts/nf/repo/dev/infactory_api/connectors/sql_connector.py", line 1130, in generate_data_streams
| test_df = cx.read_sql(modified_connection, sampling_sql, return_type="polars")
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/Users/kocienda/Library/Caches/pypoetry/virtualenvs/infactory-Dh53eh-z-py3.12/lib/python3.12/site-packages/connectorx/__init__.py", line 409, in read_sql
| result = _read_sql(
| ^^^^^^^^^^
| pyo3_runtime.PanicException: not implemented: vector
+------------------------------------
I was able to reproduce the issue with ConnectorX against a local PostgreSQL container:
frontend-1 | code: 'UND_ERR_SOCKET',
api-1 | | File "/root/.cache/pypoetry/virtualenvs/infactory-9TtSrW0h-py3.12/lib/python3.12/site-packages/connectorx/__init__.py", line 409, in read_sql
frontend-1 | socket: [Object]
api-1 | | result = _read_sql(
frontend-1 | }
api-1 | | ^^^^^^^^^^
frontend-1 | }
api-1 | | RuntimeError: db error: ERROR: cannot cast type vector to double precision[]
frontend-1 | }
api-1 | |
api-1 | | During handling of the above exception, another exception occurred:
api-1 | |
api-1 | | Traceback (most recent call last):
api-1 | | File "/app/infactory_api/connectors/sql_connector.py", line 1335, in generate_data_streams
api-1 | | await upload_sql_and_sample_data(
api-1 | | File "/app/infactory_api/connectors/sql_connector.py", line 989, in upload_sql_and_sample_data
api-1 | | df = cx.read_sql(modified_connection, simplified_query, return_type="polars")
api-1 | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
api-1 | | File "/root/.cache/pypoetry/virtualenvs/infactory-9TtSrW0h-py3.12/lib/python3.12/site-packages/connectorx/__init__.py", line 409, in read_sql
api-1 | | result = _read_sql(
api-1 | | ^^^^^^^^^^
api-1 | | RuntimeError: db error: ERROR: cannot cast type vector to double precision[]
api-1 | |
api-1 | | During handling of the above exception, another exception occurred:
api-1 | |
api-1 | | Traceback (most recent call last):
api-1 | | File "/root/.cache/pypoetry/virtualenvs/infactory-9TtSrW0h-py3.12/lib/python3.12/site-packages/sse_starlette/sse.py", line 240, in cancel_on_finish
api-1 | | await coro()
api-1 | | File "/root/.cache/pypoetry/virtualenvs/infactory-9TtSrW0h-py3.12/lib/python3.12/site-packages/sse_starlette/sse.py", line 159, in _stream_response
api-1 | | async for data in self.body_iterator:
api-1 | | File "/app/infactory_api/routes/route_helpers.py", line 168, in stream
api-1 | | async for chunk in fn:
api-1 | | File "/app/infactory_api/connectors/sql_connector.py", line 1365, in generate_data_streams
api-1 | | raise HTTPException(
api-1 | | fastapi.exceptions.HTTPException: 500: Error sampling table public.high_dim_vectors: db error: ERROR: cannot cast type vector to double precision[]
api-1 | +------------------------------------
frontend-1 | [Middleware] Checking redirect for path: /api/infactory/v1/datasources/4239b1a3-99b9-439a-a5d2-f66c0e937e28/with_datalines
frontend-1 | [Middleware] Found platform_id: a9fbb528-64b8-4019-833c-268ff7b09f84
frontend-1 | [SERVER] API request: {
frontend-1 | method: 'GET',
frontend-1 | url: 'http://api:8000/v1/datasources/4239b1a3-99b9-439a-a5d2-f66c0e937e28/with_datalines'
frontend-1 | }
api-1 | 2025-05-21 18:33:42 UTC.430 [INFO] HTTP Request: POST http://localhost:38179/ "HTTP/1.1 200 OK"
api-1 | 2025-05-21 18:33:42 UTC.433 [INFO] HTTP Request: POST http://localhost:38179/ "HTTP/1.1 200 OK"
api-1 | 2025-05-21 18:33:42 UTC.434 [INFO] HTTP Request: POST http://localhost:38179/ "HTTP/1.1 200 OK"
api-1 | 2025-05-21 18:33:42 UTC.436 [INFO] HTTP Request: POST http://localhost:38179/ "HTTP/1.1 200 OK"
api-1 | 2025-05-21 18:33:42 UTC.437 [INFO] 172.18.0.3:47612 - "GET /v1/datasources/4239b1a3-99b9-439a-a5d2-f66c0e937e28/with_datalines HTTP/1.1" 200
It would be great if ConnectorX supported pgvector columns. As it stands, I had to add another SQL library (psycopg) to our code base solely to support pgvector.
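For ConnectorX versions without pgvector support, one possible workaround (a sketch, not something confirmed by the maintainers) is to cast the vector column to text in the SQL itself and parse the result on the Python side. pgvector prints a vector as text in the form [1,2,3], so the parsing step is small. The helper name parse_pgvector_text and the column names id and embedding below are hypothetical and only there for illustration:

```python
from typing import List, Optional


def parse_pgvector_text(value: Optional[str]) -> Optional[List[float]]:
    """Parse pgvector's text form, e.g. '[1,2,3]', into a list of floats.

    Hypothetical helper for illustration; not part of ConnectorX or pgvector.
    """
    if value is None:
        # A SQL NULL stays None.
        return None
    text = value.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a pgvector text literal: {value!r}")
    inner = text[1:-1].strip()
    if not inner:
        return []
    # float() tolerates surrounding whitespace in each component.
    return [float(part) for part in inner.split(",")]


# Assumed usage against a live database (column names are made up):
#
#   import connectorx as cx
#   sql = (
#       "SELECT id, embedding::text AS embedding "
#       "FROM public.episodic_memories ORDER BY RANDOM() LIMIT 50"
#   )
#   df = cx.read_sql(conn, sql, return_type="polars")
#   vectors = [parse_pgvector_text(v) for v in df["embedding"].to_list()]
```

The cast avoids the "not implemented: vector" panic because ConnectorX only ever sees a text column. The cost is that every vector column has to be listed explicitly, so a plain SELECT * sampling query would need to be rewritten per table.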
Thanks @holicc for adding support for pgvector! I have released an alpha version: pip install connectorx==0.4.4a1. Please feel free to try it out!
@wangxiaoying @holicc can confirm from our end that this is working ✅