find-duplicates

Test hashes of parts of large files before hashing entire file

Open twpayne opened this issue 11 months ago • 0 comments

For large files (say, over 1MB in size), we can hash a few parts of the file to quickly detect non-duplicates without having to read the entire file. If any sampled part differs, the files cannot be identical, so only files whose sampled parts all match need a full hash. A possible set of parts to hash is the first 4KB, the 4KB-aligned 4KB page at the middle of the file, and the last 4KB-aligned 4KB page. It might even be sufficient to hash only the middle page.
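To make the idea concrete, here is a minimal sketch in Python. It is not the project's implementation: the names `sample_offsets` and `partial_hash`, the choice of SHA-256, and the reading of "last 4KB-aligned 4KB" as the page containing the file's final byte are all assumptions made for illustration.

```python
import hashlib
import os

PAGE_SIZE = 4096                 # 4KB part size, as suggested above
LARGE_FILE_THRESHOLD = 1 << 20   # 1MB cutoff, as suggested above


def sample_offsets(size, page_size=PAGE_SIZE):
    """Return the page-aligned offsets of the first, middle, and last pages.

    Hypothetical helper: offsets are deduplicated, so a small file that fits
    in one or two pages yields fewer than three offsets.
    """
    if size <= 0:
        return []
    middle_page = (size // 2) // page_size
    last_page = (size - 1) // page_size  # page that holds the final byte
    return [page * page_size for page in sorted({0, middle_page, last_page})]


def partial_hash(path, page_size=PAGE_SIZE):
    """Hash only the sampled pages of the file at path (assumed SHA-256)."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for offset in sample_offsets(size, page_size):
            f.seek(offset)
            # The last page may be shorter than page_size; read() handles that.
            digest.update(f.read(page_size))
    return digest.hexdigest()
```

A caller would presumably compute `partial_hash` only for files at or above `LARGE_FILE_THRESHOLD`, and fall back to a full hash for any files whose partial hashes collide, since matching samples do not prove that two files are identical.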

twpayne • Mar 08 '24 13:03