Add support for DuckDB

Open extrobe opened this issue 1 year ago • 2 comments

DuckDB is an in-process database. You typically create it as a session, then discard it once you're done (though that's not the only way to use it).

https://duckdb.org

It's awesome for a few reasons that apply to data-diff. Namely, you can directly query raw csv/txt/parquet files as though they were tables (e.g. select posting_date, count(*) as r_count from '/Users/me/data.csv' group by posting_date). We use this ability to load PROD vs. UAT files from our system and compare the output. Being able to pass this across to data-diff would be incredible.

Whilst just being able to reference csv files in data-diff might be another option, doing this via DuckDB would allow you to perform some basic transformations on the way, such as renaming fields, selecting a reduced range etc.

extrobe, Jul 25 '22 07:07

I have actually started working on a duckdb driver not so long ago, might have something ready next week, but the second part of this

> Whilst just being able to reference csv files in data-diff might be another option, doing this via DuckDB would allow you to perform some basic transformations on the way, such as renaming fields, selecting a reduced range etc.

might deserve a separate issue as it could be generalized for all drivers, no?

danthelion, Jul 28 '22 18:07

> I have actually started working on a duckdb driver not so long ago, might have something ready next week, but the second part of this

Nice! Would love to give it a go when you have something (though I should point out I'm a data-diff newbie, so I'm not across every aspect of it).

> might deserve a separate issue as it could be generalized for all drivers, no?

I absolutely agree... though I think an aspect of this is captured in https://github.com/datafold/data-diff/issues/79

extrobe, Jul 30 '22 11:07

DuckDB is now supported! It's already available in master, and will be included in the upcoming release.

erezsh, Nov 16 '22 13:11

Awesome! Look forward to trying it out!

extrobe, Nov 19 '22 04:11