feat: add read_parquet support to MySQL, SQLite and Postgres via DuckDB

Open • gforsyth opened this issue 1 year ago • 1 comment

Is your feature request related to a problem?

No response

Describe the solution you'd like

DuckDB has connectors for MySQL, SQLite and Postgres that allow reading from and writing to those databases (some info here: https://duckdb.org/2024/01/26/multi-database-support-in-duckdb.html).

Currently none of those backends support direct registration of parquet files, because the underlying databases can't read parquet themselves. However, we can use DuckDB (when available) to load the parquet file and then insert it into the attached database.

If DuckDB isn't available, we fall back to the existing error message (although maybe with a note about installing DuckDB to enable direct registration).
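A rough sketch of what this could look like, not a proposed implementation. The helper names (`build_statements`, `read_parquet_via_duckdb`), the `target` alias and the error text are all made up for illustration; only the DuckDB SQL (`INSTALL`, `LOAD`, `ATTACH ... (TYPE ...)`, `read_parquet`) and `duckdb.connect()` / `execute()` are existing DuckDB features. How this would hook into the ibis backends is left open.

```python
def _sql_str(value: str) -> str:
    """Quote a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def _sql_ident(name: str) -> str:
    """Quote a SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_statements(parquet_path, target, table_name, db_type="sqlite"):
    """Build the DuckDB statements that copy a parquet file into `target`.

    `target` is a file path for sqlite, or a connection string for
    postgres / mysql. All names here are hypothetical.
    """
    if db_type not in {"sqlite", "postgres", "mysql"}:
        raise ValueError(f"unsupported database type: {db_type!r}")
    return [
        f"INSTALL {db_type}",  # fetches the extension on first use
        f"LOAD {db_type}",
        f"ATTACH {_sql_str(target)} AS target (TYPE {db_type})",
        (
            f"CREATE TABLE target.{_sql_ident(table_name)} AS "
            f"SELECT * FROM read_parquet({_sql_str(parquet_path)})"
        ),
    ]


def read_parquet_via_duckdb(parquet_path, target, table_name, db_type="sqlite"):
    """Load a parquet file into the attached database using DuckDB."""
    try:
        import duckdb
    except ImportError as exc:
        # Fallback: same kind of error as today, plus an install hint.
        raise NotImplementedError(
            "this backend cannot read parquet files directly; "
            "installing duckdb enables direct registration"
        ) from exc

    con = duckdb.connect()  # in-memory DuckDB used only as a bridge
    try:
        for stmt in build_statements(parquet_path, target, table_name, db_type):
            con.execute(stmt)
    finally:
        con.close()
```

For example, `read_parquet_via_duckdb("events.parquet", "app.db", "events")` would create an `events` table in the SQLite file `app.db`, assuming DuckDB is installed and can download the `sqlite` extension.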

What version of ibis are you running?

BLEEDING EDGE

What backend(s) are you using, if any?

All of them

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

gforsyth, Jan 26 '24 21:01

A different option would be to use pyarrow to do this instead.

Pros:

  • Always installed currently.
  • It's also what we use for the default to_parquet implementation

Cons:

  • Almost certainly slower than what duckdb is doing since the path would be pyarrow -> list of tuples -> dbapi driver. This could be remedied once ADBC is stable though.
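To make the pyarrow path concrete, here is a minimal sketch of the pyarrow -> list of tuples -> dbapi driver route. It uses the stdlib `sqlite3` module as a stand-in DBAPI driver; `insert_rows` and `load_parquet` are hypothetical names, the target table is assumed to exist already, and the `placeholder` argument is there because other drivers use `%s` rather than `?`.

```python
import sqlite3


def insert_rows(con, table, columns, rows, placeholder="?"):
    """Insert a list of tuples through a DBAPI connection."""
    cols = ", ".join(f'"{c}"' for c in columns)
    marks = ", ".join(placeholder for _ in columns)
    cur = con.cursor()
    cur.executemany(f'INSERT INTO "{table}" ({cols}) VALUES ({marks})', rows)
    con.commit()


def load_parquet(con, path, table, batch_size=10_000, placeholder="?"):
    """Stream a parquet file into an existing table, one batch at a time."""
    import pyarrow.parquet as pq  # imported lazily; only needed here

    pf = pq.ParquetFile(path)
    columns = pf.schema_arrow.names
    for batch in pf.iter_batches(batch_size=batch_size):
        # Each batch becomes a list of dicts, then a list of tuples.
        # This per-row Python conversion is the slow step.
        rows = [tuple(rec[c] for c in columns) for rec in batch.to_pylist()]
        insert_rows(con, table, columns, rows, placeholder)
```

With a connection from `sqlite3.connect("app.db")` and an existing `events` table, `load_parquet(con, "events.parquet", "events")` would append the file's rows batch by batch.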

jcrist, Jan 26 '24 22:01