
Implement reader support for CalculateDifferences (& various improvements)

Open Freeaqingme opened this issue 5 years ago • 1 comments

  • Add license file
  • Implement reader support for CalculateDifferences
  • Allow to use alternate hashing mechanisms (like xxhash)

My suggestion would be to make this branch the new default (and/or merge it with master). Some additional notes:

  • I added more test data. Because I didn't want 25+ MiB of test data in my own monorepo, I chose to compress it instead.
  • It turned out that CalculateDifferences() didn't support readers yet. That was actually the interesting part :) The implementation I came up with accepts a reader without loading the reader's entire contents into memory. It is a little slower (~20%) than the original. There's some room for micro-optimizations, but given that the original simply read the entire file into memory and operated on that single []byte slice, I don't think a reader-based version can match it.
  • Though the original paper refers to MD4, I think there are other, non-cryptographic hashes that are faster and just as good (or better) for the purposes of rsync. Personally, I use it with xxhash, but opted not to make it the default so that we don't have to rely on external libraries.
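
To illustrate the last two points, here is a minimal sketch in Go of hashing a reader block by block with a pluggable hash constructor. It is not the code from this branch: `BlockSum`, `blockSums`, `sumsOfString`, `newMD5`, and `newFNV` are hypothetical names, and it only covers per-block checksums, not the rolling-checksum matching that CalculateDifferences() performs. MD5 and FNV-1a from the standard library stand in for MD4 and xxhash to avoid external dependencies; the assumption is that any type satisfying `hash.Hash` could be plugged in the same way.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"hash"
	"hash/fnv"
	"io"
	"strings"
)

// BlockSum is a hypothetical record: a block index and that block's checksum.
type BlockSum struct {
	Index int
	Sum   []byte
}

// Hypothetical constructors; anything returning a hash.Hash can be used.
func newMD5() hash.Hash { return md5.New() }
func newFNV() hash.Hash { return fnv.New64a() }

// blockSums reads r in blockSize chunks and hashes each chunk with the hash
// produced by newHash. Only one block is held in memory at a time.
func blockSums(r io.Reader, blockSize int, newHash func() hash.Hash) ([]BlockSum, error) {
	if blockSize <= 0 {
		return nil, fmt.Errorf("block size must be positive, got %d", blockSize)
	}
	buf := make([]byte, blockSize)
	h := newHash()
	var sums []BlockSum
	for i := 0; ; i++ {
		// ReadFull fills buf unless the reader ends first.
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			h.Reset()
			h.Write(buf[:n]) // hash.Hash.Write never returns an error
			sums = append(sums, BlockSum{Index: i, Sum: h.Sum(nil)})
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return sums, nil // end of input; the last block may be short
		}
		if err != nil {
			return sums, err
		}
	}
}

// sumsOfString is a small convenience wrapper for trying the sketch out.
func sumsOfString(s string, blockSize int, newHash func() hash.Hash) ([]BlockSum, error) {
	return blockSums(strings.NewReader(s), blockSize, newHash)
}

func main() {
	for name, ctor := range map[string]func() hash.Hash{"md5": newMD5, "fnv": newFNV} {
		sums, err := sumsOfString("abcdefghij", 4, ctor)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s: %d blocks, %d-byte sums\n", name, len(sums), len(sums[0].Sum))
	}
}
```

Passing a constructor rather than a fixed hash keeps the default free of third-party imports while letting callers opt in to a faster hash.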

If you have any requests or suggestions please let me know. Happy to address any!

Edit: Please don't merge it yet. I had the previous version implemented in my own application, but I would first like to implement this version there, to be sure it works not just against the test files but also in an existing application. Works :)

Freeaqingme · Aug 06 '19 20:08

Thanks for the detailed description, Dolf. I'll review this in the next few days.

juli4n · Aug 11 '19 17:08