Chad Netzer

15 comments by Chad Netzer

Picking up from the MaxSize discussion started in issue #11, I'd say a big question is how should the MaxSize limit be honored? If it's meant as a safety guard...

Currently, the number of bytes read can be greater than MaxSize, and there is nothing to stop the byte comparison from being performed on the excess read() data beyond MaxSize,...
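To make the concern concrete, here is a minimal Python sketch of a comparison that never examines more than a given cap. It is only an illustration under assumptions: `equal_up_to`, `max_size` and `chunk_size` are hypothetical names, not the project's actual code, and it assumes regular files (or in-memory streams), where a short read only happens at end of file.

```python
import io


def equal_up_to(f1, f2, max_size, chunk_size=4096):
    """Compare two binary streams, examining at most max_size bytes of each.

    Hypothetical helper: returns True if the first max_size bytes (or the
    whole contents, if shorter) are identical.
    """
    remaining = max_size
    while remaining > 0:
        # Never request more than the cap allows, so no excess data is read.
        want = min(chunk_size, remaining)
        b1 = f1.read(want)
        b2 = f2.read(want)
        if b1 != b2:
            return False
        if not b1:
            # Both streams hit EOF at the same point with no difference.
            return True
        remaining -= len(b1)
    return True


if __name__ == "__main__":
    a = io.BytesIO(b"abcdX")
    b = io.BytesIO(b"abcdY")
    print(equal_up_to(a, b, max_size=4))
```

The point of clamping each read to `remaining` is that the cap bounds both the I/O and the comparison, rather than only being checked after a full-sized read.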

Yeah, it might be another item for better documenting, since users currently have the option of setting MaxSize equal to a low value (even 1), and having that low value...

@akaihola That's a good descriptive expansion of the projects I alluded to. At least for my fork, I can mention that I've essentially rewritten (or heavily modified) the current algorithm...

@akaihola I might be doing some sporadic traveling for a couple weeks, which may coincide w/ John V returning, so in the short term we can maybe wait and see...

I added a test that exercises this bug by setting up the initial link relationship in reverse order. I.e., it adds a test which links dir2/name1.ext to dir1/link before running...

Reopening this PR and rebasing fix on current master.

The MAX_HASHES is likely an artifact of being ported over from C++, which probably used a static hash table (C++ had no standard hash table for the longest time). It's...

I have a branch that replaces the current directory walking with os.walk(), which uses os.scandir() on recent Python 3 (which can greatly improve performance of os.walk()). The benefit (for now)...
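For reference, a basic os.walk() traversal looks roughly like the sketch below. This is not the code from the branch mentioned above; `iter_regular_files` is a hypothetical name, and skipping symlinks is an assumption made for the illustration.

```python
import os


def iter_regular_files(top):
    """Yield paths of regular (non-symlink) files under top.

    Hypothetical helper; os.walk() is backed by os.scandir() on
    Python 3.5 and later.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so only real files are considered.
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


if __name__ == "__main__":
    for path in iter_regular_files("."):
        print(path)
```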

Hmmm, with some refactoring of main() and hardlink_identical_files(), it appears possible to avoid using os.walk() while supporting its directory pre-culling semantics (with the exclude option). This should also allow directly...
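One way such a walker might look, as a rough sketch only: it uses os.scandir() directly and culls excluded directories before descending into them. `scan_tree` and `exclude_patterns` are hypothetical names, and matching the regexes against the full entry path is an assumption, not necessarily how the exclude option behaves.

```python
import os
import re


def scan_tree(top, exclude_patterns=()):
    """Yield regular file paths under top, skipping excluded entries.

    Hypothetical sketch: excluded directories are culled before they are
    ever opened, mirroring what pruning dirnames in os.walk() achieves.
    """
    excludes = [re.compile(p) for p in exclude_patterns]
    stack = [top]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if any(rx.search(entry.path) for rx in excludes):
                    continue  # culled: never descended into or yielded
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path


if __name__ == "__main__":
    for path in scan_tree(".", exclude_patterns=[r"\.git$"]):
        print(path)
```

Because the DirEntry objects are handled directly, the file-type checks can usually be answered from the directory listing without an extra stat() call per entry.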