data-diff CSV insertions for database for better benchmarking

Open sirupsen opened this issue 3 years ago • 0 comments

Currently, the benchmarking scripts (see the README and https://github.com/datafold/data-diff/pull/135) are very slow when run against the cloud databases. It would be better to use CSV imports for the cloud databases (Redshift, BigQuery, Oracle, Snowflake) by doing something in _insert_to_table similar to what dev/_bq_import_csv.py does.

It will work today for 100M rows, but it'll be very slow...
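
A rough sketch of what a CSV-based path could look like, not the actual implementation: rows are written to temporary CSV files in chunks, and each file is handed to a per-database bulk loader. The names `insert_via_csv`, `chunked`, and the `load_csv` callback are hypothetical; the callback stands in for whatever each database's native CSV import is (for BigQuery, presumably something along the lines of dev/_bq_import_csv.py).

```python
import csv
import itertools
import os
import tempfile
from typing import Callable, Iterable, Iterator, List, Sequence


def chunked(rows: Iterable[Sequence], size: int) -> Iterator[List[Sequence]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_via_csv(
    rows: Iterable[Sequence],
    columns: Sequence[str],
    load_csv: Callable[[str], None],  # hypothetical: runs the database's bulk CSV import
    chunk_size: int = 100_000,
) -> int:
    """Write rows to temporary CSV files and pass each file to `load_csv`.

    Returns the number of rows handed to the loader.
    """
    total = 0
    for chunk in chunked(rows, chunk_size):
        # delete=False so the loader can reopen the file by path after it is closed.
        with tempfile.NamedTemporaryFile(
            "w", newline="", suffix=".csv", delete=False
        ) as f:
            writer = csv.writer(f)
            writer.writerow(columns)  # header row; assumes the loader skips it
            writer.writerows(chunk)
            path = f.name
        try:
            load_csv(path)
        finally:
            os.remove(path)
        total += len(chunk)
    return total
```

Chunking keeps memory use and file sizes bounded, so the same approach should still apply at 100M rows; the right chunk size would depend on each database's import limits.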

Jun 30 '22 19:06 sirupsen
