
Add benchmarking scripts for % of changed rows

Open sirupsen opened this issue 3 years ago • 0 comments

Currently, the benchmarking introduced in https://github.com/datafold/data-diff/pull/135 only compares two tables that are equal.

We'd love to add some tests where we delete/change an increasing % of rows, starting at just 1 row, to see how the diff behaves.
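A rough sketch of how the changed-row test data could be produced, assuming rows are plain `(id, value)` tuples held in memory. The names `change_counts` and `mutate_rows`, and the default fractions, are hypothetical and not part of data-diff; a real benchmark would apply the same deletes/updates to a database table instead.

```python
import random


def change_counts(total_rows, fractions=(0.0001, 0.001, 0.01, 0.1, 0.5)):
    """Return increasing numbers of rows to change, starting at exactly 1 row."""
    counts = {1}
    for fraction in fractions:
        # Clamp to [1, total_rows] so tiny tables still change at least one row.
        counts.add(max(1, min(total_rows, round(total_rows * fraction))))
    return sorted(counts)


def mutate_rows(rows, n_changes, seed=0):
    """Return a copy of rows with n_changes rows altered.

    Half of the picked rows (rounded down) are deleted, the rest are updated.
    The input list is left untouched.
    """
    rng = random.Random(seed)  # fixed seed keeps benchmark runs reproducible
    picked = rng.sample(range(len(rows)), n_changes)
    to_delete = set(picked[: n_changes // 2])
    to_update = set(picked[n_changes // 2:])
    mutated = []
    for index, (row_id, value) in enumerate(rows):
        if index in to_delete:
            continue  # simulate a deleted row
        if index in to_update:
            value = f"{value}-changed"  # simulate an updated row
        mutated.append((row_id, value))
    return mutated
```

Each count from `change_counts(len(rows))` could then be passed to `mutate_rows` to build one "changed" copy of the table per benchmark run.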

It would be especially useful to know at what threshold just pulling down all the rows would be faster.

We should generate some graphs showing where checksum and download performance cross over. That would be gold for the README.
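One possible way to locate that crossover once timings exist, sketched with the standard library only. `best_time` and `find_crossover` are hypothetical helper names, and the timing lists are assumed to come from running each strategy at each changed fraction; nothing here calls data-diff itself.

```python
import time


def best_time(func, repeats=3):
    """Run func several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def find_crossover(changed_fractions, checksum_seconds, download_seconds):
    """Return the smallest changed fraction at which downloading all rows
    is at least as fast as checksumming, or None if that never happens."""
    points = sorted(zip(changed_fractions, checksum_seconds, download_seconds))
    for fraction, checksum_time, download_time in points:
        if download_time <= checksum_time:
            return fraction
    return None
```

The same three lists could be fed to a plotting library to draw the two curves for the README.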

sirupsen, Jun 30 '22 20:06