git-lfs-ipfs

Allow using buzhash chunker and switch to it by default

Open • RubenKelevra opened this issue 1 year ago • 1 comment

Currently, the fixed-size 256 KiB chunker is hard-coded.

Switching to the buzhash chunker gives a large advantage when files are modified and data is moved around, as in VM images or tar archives. It also works pretty well on zstd archives created with --rsyncable.

Because buzhash picks chunk boundaries from the content itself, data that has moved to a different offset can still be matched, so different versions of a file can be deduplicated against each other.
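To illustrate why that matters, here is a minimal sketch comparing a fixed-size chunker with a buzhash-style content-defined chunker. This is not the chunker IPFS ships: the window size, mask, size limits, and the helper names (`fixed_chunks`, `buzhash_chunks`, `shared_chunks`) are assumptions picked to keep the demo small.

```python
import hashlib
import random

# Illustrative parameters only; NOT the values used by IPFS.
WINDOW = 48            # bytes covered by the rolling hash
MASK = (1 << 10) - 1   # cut when the low 10 bits are zero
MIN_SIZE = 256         # never cut before this many bytes
MAX_SIZE = 8192        # always cut at this many bytes

# Fixed table mapping each byte value to a pseudo-random 32-bit word.
_table_rng = random.Random(0)
TABLE = [_table_rng.getrandbits(32) for _ in range(256)]


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def fixed_chunks(data, size):
    """Split data into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def buzhash_chunks(data):
    """Split data where a rolling hash of the last WINDOW bytes hits MASK."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        pos = i - start  # offset of this byte inside the current chunk
        # Roll the hash forward: rotate, then mix in the incoming byte.
        h = rotl32(h, 1) ^ TABLE[byte]
        if pos >= WINDOW:
            # Remove the byte that just left the window.
            h ^= rotl32(TABLE[data[i - WINDOW]], WINDOW)
        length = pos + 1
        if (length >= MIN_SIZE and (h & MASK) == 0) or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared_chunks(a, b):
    """Count distinct chunks (by SHA-256) present in both chunk lists."""
    digests_a = {hashlib.sha256(c).digest() for c in a}
    digests_b = {hashlib.sha256(c).digest() for c in b}
    return len(digests_a & digests_b)


if __name__ == "__main__":
    rng = random.Random(1)
    original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
    # Insert a few bytes near the start, shifting everything after them.
    modified = original[:100] + b"hello" + original[100:]
    print("fixed shared:",
          shared_chunks(fixed_chunks(original, 1024),
                        fixed_chunks(modified, 1024)))
    print("buzhash shared:",
          shared_chunks(buzhash_chunks(original),
                        buzhash_chunks(modified)))
```

With the fixed-size chunker, the insertion shifts every later chunk boundary, so essentially nothing matches between the two versions. With the rolling hash, boundaries follow the content, so the chunks after the edit line up again and can be reused.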

As that's a prime target for using Git LFS, I think it's worth changing the default here to buzhash.

There are no disadvantages, except a bit of CPU time spent on the rolling hash.

RubenKelevra, Mar 26 '23 14:03

Would you mind making a PR for this?

sameer, Mar 28 '23 04:03