data-diff
Add benchmarking scripts for % of changed rows
Currently, the benchmark introduced in https://github.com/datafold/data-diff/pull/135 only compares two tables that are identical.
We'd love to add some benchmarks where we delete/change an increasing % of rows, starting at just 1 row, to see how the diff performs as the tables diverge.
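As a rough illustration of what such a setup step could look like, here is a minimal sketch using Python's built-in `sqlite3`. The helper names (`make_table`, `mutate_rows`, `change_counts`), the two-column schema, and the percentage steps are all hypothetical and not part of data-diff or the existing benchmark; a real script would target the databases the benchmark already uses.

```python
import random
import sqlite3


def make_table(conn, name, n_rows):
    """Create a simple (id, value) table with n_rows rows. Hypothetical schema."""
    conn.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany(
        f"INSERT INTO {name} VALUES (?, ?)",
        ((i, f"row-{i}") for i in range(n_rows)),
    )


def change_counts(n_rows, percents=(1, 5, 10, 25, 50, 100)):
    """Row counts to mutate: always 1 row first, then each percentage of n_rows."""
    counts = [1] + [max(1, n_rows * p // 100) for p in percents]
    return sorted(set(counts))


def mutate_rows(conn, name, n_changed, seed=0):
    """Change n_changed randomly chosen rows: roughly half updated, half deleted.

    Returns (rows_updated, rows_deleted). The seed keeps runs reproducible.
    """
    rng = random.Random(seed)
    ids = [row[0] for row in conn.execute(f"SELECT id FROM {name} ORDER BY id")]
    picked = rng.sample(ids, n_changed)
    to_update = picked[::2]   # even positions get their value changed
    to_delete = picked[1::2]  # odd positions are removed
    conn.executemany(
        f"UPDATE {name} SET value = value || '-changed' WHERE id = ?",
        [(i,) for i in to_update],
    )
    conn.executemany(
        f"DELETE FROM {name} WHERE id = ?",
        [(i,) for i in to_delete],
    )
    return len(to_update), len(to_delete)
```

For each value from `change_counts(n_rows)`, the benchmark would build a fresh copy of the table, call `mutate_rows` on it, and then time the diff against the untouched original.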
It would be especially useful to know at what threshold just pulling down all the rows becomes faster than checksumming.
We should generate some graphs showing where checksum and download performance cross over. That would be gold for the README.
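One possible way to locate that crossing from measured timings is sketched below. Both `time_call` and `find_crossover` are hypothetical helpers, not data-diff functions, and the sketch assumes each changed-row fraction has already been timed once with the checksum approach and once with a full download.

```python
import time


def time_call(fn, repeats=3):
    """Run fn() several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def find_crossover(fractions, checksum_secs, download_secs):
    """Return the smallest changed-row fraction at which downloading all rows
    is at least as fast as checksumming, or None if that never happens.

    The three sequences are parallel: one timing pair per fraction.
    """
    for frac, checksum, download in sorted(zip(fractions, checksum_secs, download_secs)):
        if download <= checksum:
            return frac
    return None
```

The same three sequences could feed a plot of time against % of changed rows, with the value returned by `find_crossover` marked on the x-axis.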