cloudquery
feat: Add option to use Appender API
This continues https://github.com/cloudquery/cloudquery/pull/16668 (now with updated library support), which had to be scrapped earlier due to a version incompatibility.
This time I've added it as an option. If it works, it could be significantly faster than encoding to and decoding from Parquet. Two caveats: maps are unsupported in the API*, and structs are represented as strings due to language limitations.
* Maps are apparently supported as of https://github.com/marcboeker/go-duckdb/pull/237, but that support hasn't been implemented in this PR.
The Appender API is a way of bulk-loading data into DuckDB through an API (see the API docs). I'm not sure it's really faster than writing to and then loading from Parquet, but in theory it should be, since all the intermediate steps are in memory.
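To make the constraints mentioned above concrete, here is a minimal Go sketch of the row-preparation step, not the code from this PR. It assumes a bulk-append handle that takes one row at a time and is flushed at the end. `rowSink`, `memorySink`, `toAppendable`, `appendAll`, and `endpoint` are hypothetical names introduced for illustration only. The in-memory sink stands in for the real appender so the example runs with the standard library alone.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// endpoint is a hypothetical struct-typed column value used for the demo.
type endpoint struct {
	Name string `json:"name"`
	Port int    `json:"port"`
}

// rowSink is a hypothetical stand-in for a bulk-append handle:
// rows are buffered one at a time and written out on Flush.
type rowSink interface {
	AppendRow(vals ...any) error
	Flush() error
}

// memorySink is an in-memory rowSink used only for illustration.
type memorySink struct {
	pending [][]any
	flushed [][]any
}

func (s *memorySink) AppendRow(vals ...any) error {
	s.pending = append(s.pending, vals)
	return nil
}

func (s *memorySink) Flush() error {
	s.flushed = append(s.flushed, s.pending...)
	s.pending = nil
	return nil
}

// toAppendable mirrors the caveats in the description (an assumption about
// how they could be handled): maps are rejected, structs are encoded as
// JSON strings, and every other value is passed through unchanged.
func toAppendable(v any) (any, error) {
	if v == nil {
		return nil, nil
	}
	switch reflect.TypeOf(v).Kind() {
	case reflect.Map:
		return nil, fmt.Errorf("map values are not supported: %T", v)
	case reflect.Struct:
		b, err := json.Marshal(v)
		if err != nil {
			return nil, err
		}
		return string(b), nil
	default:
		return v, nil
	}
}

// appendAll converts every value, appends each row, and flushes once at the
// end. It returns on the first error without flushing.
func appendAll(sink rowSink, rows [][]any) error {
	for i, row := range rows {
		vals := make([]any, len(row))
		for j, v := range row {
			conv, err := toAppendable(v)
			if err != nil {
				return fmt.Errorf("row %d, column %d: %w", i, j, err)
			}
			vals[j] = conv
		}
		if err := sink.AppendRow(vals...); err != nil {
			return err
		}
	}
	return sink.Flush()
}

func main() {
	sink := &memorySink{}
	rows := [][]any{
		{1, endpoint{Name: "primary", Port: 5432}},
		{2, endpoint{Name: "replica", Port: 5433}},
	}
	if err := appendAll(sink, rows); err != nil {
		fmt.Println("append failed:", err)
		return
	}
	fmt.Println("rows flushed:", len(sink.flushed))
}
```

Flushing once per batch keeps all intermediate data in memory until the final write, which is the property the speed-up would depend on. Whether it actually beats the Parquet path would still need a benchmark.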
I will add a test later.
Going to close this as stale, at least until we get reports of performance issues with DuckDB. Please re-open if you think we still need this, along with benchmarks showing how it improves on our current approach.