connector-x
Support JSONB fallback for polars
The problem
When loading Postgres JSONB columns into polars, connectorx currently throws:
RuntimeError: No conversion rule from JSONB(true) to connectorx::destinations::arrow::typesystem::ArrowTypeSystem.
Related: #121
The solution
For sqlite tables with JSON columns, connectorx simply loads these columns as strings without parsing them into JSON. This is a nice fallback, and it would be nice to have the same behavior for postgresql instead of throwing an error.
This would be very helpful when loading a set of rows that happens to include JSON columns: the query would no longer break, and the JSON text could still be decoded in the application when needed.
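As a sketch of what that fallback would enable (the rows below are hypothetical, standing in for what connectorx would return if JSONB arrived as plain JSON text, the way sqlite JSON columns already do), the application can decode the text with the standard library:

```python
import json

# Hypothetical rows after the proposed fallback: the JSONB column
# arrives as a plain JSON string instead of raising a conversion error.
rows = [
    (1, 'first', '{"version": 2}'),
    (2, 'second', '{"version": 3}'),
    (3, 'third', '{"version": 2}'),
]

# Decode the JSON text in the application only where it is needed.
notes = [json.loads(note) for _, _, note in rows]
```
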
Minimal example to reproduce
# create table and insert rows
import psycopg

connect_str = 'dbname={dbname} user={username}'
create_sql = 'CREATE TABLE IF NOT EXISTS example ( id BIGINT, name TEXT, note JSONB )'
insert_sql = 'INSERT INTO example VALUES (%s, %s, %s)'
insert_rows = [
    [1, 'first', psycopg.types.json.Jsonb({'version': 2})],
    [2, 'second', psycopg.types.json.Jsonb({'version': 3})],
    [3, 'third', psycopg.types.json.Jsonb({'version': 2})],
]
with psycopg.connect(connect_str) as conn:
    with conn.cursor() as cursor:
        cursor.execute(create_sql)
        cursor.executemany(insert_sql, insert_rows)
# query data with connectorx
import connectorx

result = connectorx.read_sql(
    conn='postgresql://{username}@localhost:5432/{dbname}',
    query='SELECT * FROM example',
    return_type='polars',
)
>>> RuntimeError: No conversion rule from JSONB(true) to connectorx::destinations::arrow::typesystem::ArrowTypeSystem.
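Until a fallback lands, one possible workaround (assuming you control the query text) is to cast the JSONB column to text on the Postgres side, so connectorx receives a TEXT column it already knows how to convert:

```python
# Hypothetical workaround: Postgres's `::text` cast turns the JSONB value
# into plain text before it ever reaches connectorx, so no JSONB conversion
# rule is needed on the client side.
query = 'SELECT id, name, note::text AS note FROM example'

# result = connectorx.read_sql(conn=connect_str, query=query, return_type='polars')
```
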
Polars now has a native Binary type. Mapping JSONB to Binary might be an even better fit, and more performant, than falling back to strings.
The same happens with the Redshift SUPER type, which is commonly used to hold JSON data and currently fails with a "type not implemented" error.