Hardlinks are considered duplicates.
Hello, I have many files stored as hardlinks on my Windows PC, and Czkawka reports them as duplicates even though they are not real duplicates, since hardlinks point to the same data on disk.
I ran the analysis using the Hash + Blake3 method, and it returned a LOT of alleged duplicate files that were all hardlinks.
How can I make the software ignore hardlinks?
Currently, ignoring hardlinks is implemented only for Unix-based systems like Linux and macOS.
If anyone wants to implement this feature, this is the code responsible for it: https://github.com/qarmin/czkawka/blob/2af71023b582bf853ab18e433ee46eb450e5d61a/czkawka_core/src/duplicate.rs#L1306-L1324
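For reference, the Unix check boils down to comparing device and inode numbers. A minimal sketch using the stable std APIs (not the actual czkawka code; the helper name is made up for illustration):

```rust
// Sketch only: on Unix, two paths are hardlinks of the same file when they
// share both a device number and an inode number.
#[cfg(unix)]
fn is_hardlink_pair(a: &std::path::Path, b: &std::path::Path) -> std::io::Result<bool> {
    use std::os::unix::fs::MetadataExt;
    let (ma, mb) = (std::fs::metadata(a)?, std::fs::metadata(b)?);
    Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
}
```

The Windows port needs an equivalent notion of file identity, which the comments below discuss.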
While I don't have a PR to submit (I have zero experience with Rust), I did reach somewhat of a solution. However, it requires the Win32 API, which can be used through the Rust for Windows dependency. The method revolves around GetFileInformationByHandle, which returns a structure named BY_HANDLE_FILE_INFORMATION; you then check whether nFileIndexHigh, nFileIndexLow, and dwVolumeSerialNumber are the same for both files. Hope this somewhat helps. Reference
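A minimal sketch of that check, assuming the `windows-sys` crate as the Windows binding (the crate choice and the helper name are assumptions; czkawka does not currently contain this code):

```rust
// Sketch only: identify a file by (volume serial number, 64-bit file index).
// Two paths with equal identities are hardlinks to the same file.
#[cfg(windows)]
fn file_identity(path: &std::path::Path) -> std::io::Result<(u32, u64)> {
    use std::os::windows::io::AsRawHandle;
    use windows_sys::Win32::Storage::FileSystem::{
        GetFileInformationByHandle, BY_HANDLE_FILE_INFORMATION,
    };

    let file = std::fs::File::open(path)?;
    // BY_HANDLE_FILE_INFORMATION is plain data, so zero-initializing is fine.
    let mut info: BY_HANDLE_FILE_INFORMATION = unsafe { std::mem::zeroed() };
    // GetFileInformationByHandle returns 0 on failure.
    if unsafe { GetFileInformationByHandle(file.as_raw_handle() as _, &mut info) } == 0 {
        return Err(std::io::Error::last_os_error());
    }
    let index = (u64::from(info.nFileIndexHigh) << 32) | u64::from(info.nFileIndexLow);
    Ok((info.dwVolumeSerialNumber, index))
}
```

During duplicate grouping, files whose identities match could then be collapsed into a single entry instead of being reported against each other.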
The file index information can also be obtained via the nightly API `std::os::windows::fs::MetadataExt::file_index`, which returns an `Option<u64>`, and this value is not guaranteed to be unique on some filesystems.
Please refer to https://github.com/rust-lang/rust/issues/63010.
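For illustration, a minimal sketch using that nightly API (requires the `windows_by_handle` feature tracked in the issue above; the helper name is hypothetical):

```rust
// Sketch only: requires nightly Rust.
#![feature(windows_by_handle)]

#[cfg(windows)]
fn is_same_file(a: &std::path::Path, b: &std::path::Path) -> std::io::Result<bool> {
    use std::os::windows::fs::MetadataExt;
    let (ma, mb) = (std::fs::metadata(a)?, std::fs::metadata(b)?);
    // Both accessors can return None; treat that as "not provably the same file".
    Ok(ma.volume_serial_number().is_some()
        && ma.volume_serial_number() == mb.volume_serial_number()
        && ma.file_index().is_some()
        && ma.file_index() == mb.file_index())
}
```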
This is an important feature!
As an easy solution you can use `fsutil hardlink list "C:\123.txt"`, which prints the volume-relative paths of all hardlinks to that file, e.g.:
\Program Files\456.txt
\789.txt
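If this were wired into a tool, a rough sketch of invoking it from Rust could look like the following (the helper name and the lossy UTF-8 decoding are assumptions):

```rust
// Sketch only: shells out to `fsutil hardlink list <path>` on Windows and
// collects the printed volume-relative paths.
use std::process::Command;

fn hardlink_siblings(path: &str) -> std::io::Result<Vec<String>> {
    let output = Command::new("fsutil")
        .args(["hardlink", "list", path])
        .output()?;
    Ok(String::from_utf8_lossy(&output.stdout)
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect())
}
```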