fdupes icon indicating copy to clipboard operation
fdupes copied to clipboard

Add support for btrfs deduplication

Open yath opened this issue 9 years ago • 7 comments

Thie adds an option -D (or --dedupe) to fdupes which issues the BTRFS_IOC_FILE_EXTENT_SAME ioctl on files that it detects as duplicates, causing btrfs to deduplicate the data on disk.

This build option needs to be turned on separately in the Makefile by setting the HAVE_BTRFS_IOCTL_H preprocessor macro to not break builds missing this header.

Also fixes a memory leak in escapefilename().

yath avatar Dec 21 '15 15:12 yath

The btrfs-enabled duperemove utility has support for consuming fdupes duplicate lists and deduplicating them. Because of this, adding the functionality directly to fdupes itself seems unnecessary.

SLOC Directory SLOC-by-Language (Sorted) 7569 duperemove ansic=7307,sh=262 1373 fdupes ansic=1373

There’s vim, why would anyone need another editor?

Note also that fdupes does a byte-for-byte data check which will be repeated by the btrfs driver anyway, slowing the process down.

Yeah, one could probably skip the byte-for-byte check in that case.

(My fork of fdupes has a semi-documented -Q option to skip byte-for-byte checking which alleviates the redundancy in this use case, but it is unlikely such an option will be added to fdupes itself due to the inherent danger of it.)

You are welcome to merge my patch to your fork. Once it’s landed in Debian I’ll use it.

yath avatar Dec 21 '15 15:12 yath

Oh, I would also suggest changing -D to -B since a simple typo can result in -d (delete!) instead of -D and if they happen to use the -N with it, it'll just delete half their stuff instead of dedupe.

Good point, thanks. I’ve amended this commit and it also skips verifying the file contents now.

Could you try it out and open a separate issue report on my fork if it doesn't work as expected?

Yep, works. I think you could just rip out the couple of lines starting at https://git.kernel.org/cgit/linux/kernel/git/kdave/btrfs-progs.git/tree/ioctl.h#n344 (IANAL though).

yath avatar Dec 21 '15 18:12 yath

If you decide to ask Debian (or any other distro) to include your version of fdupes, I do hope you'll give the executable a different name than "fdupes".

On Mon, Dec 21, 2015, 11:46 AM Jody Bruchon [email protected] wrote:

Vim is the only editor I use ;-) As for the line counts, I'd say that's not automatically a good reason to avoid dupreremove though the fact that it requires sqlite3 and glib2 to work is a more compelling argument for inclusion directly into fdupes. My fork is highly unlikely to end up in Debian, though, and I don't know how to poke them to include it anyway...

— Reply to this email directly or view it on GitHub https://github.com/adrianlopezroche/fdupes/pull/52#issuecomment-166335244 .

adrianlopezroche avatar Dec 22 '15 01:12 adrianlopezroche

Hello Sebastian. I am reluctant to include system-specific code in mainline fdupes, but thank you for contributing.

On Mon, Dec 21, 2015, 11:09 AM Sebastian Schmidt [email protected] wrote:

Thie adds an option -D (or --dedupe) to fdupes which issues the BTRFS_IOC_FILE_EXTENT_SAME ioctl on files that it detects as duplicates, causing btrfs to deduplicate the data on disk.

This build option needs to be turned on separately in the Makefile by setting the HAVE_BTRFS_IOCTL_H preprocessor macro to not break builds missing this header.

Also fixes a memory leak in escapefilename().

You can view, comment on, or merge this pull request online at:

https://github.com/adrianlopezroche/fdupes/pull/52 Commit Summary

  • Add support for btrfs deduplication

File Changes

  • M Makefile https://github.com/adrianlopezroche/fdupes/pull/52/files#diff-0 (7)
  • M fdupes.1 https://github.com/adrianlopezroche/fdupes/pull/52/files#diff-1 (5)
  • M fdupes.c https://github.com/adrianlopezroche/fdupes/pull/52/files#diff-2 (208)

Patch Links:

  • https://github.com/adrianlopezroche/fdupes/pull/52.patch
  • https://github.com/adrianlopezroche/fdupes/pull/52.diff

— Reply to this email directly or view it on GitHub https://github.com/adrianlopezroche/fdupes/pull/52.

adrianlopezroche avatar Dec 22 '15 01:12 adrianlopezroche

with my debian maintainer hat on, I can say that 'fdupes' in debian is tracking https://github.com/adrianlopezroche/fdupes/ and it will keep doing it for the foreseeable future. but I have to admit the development here is a bit on hold, and there are a lot of forks out there which might have interesting features to add to fdupes. @adrianlopezroche do you plan to resume fdupes development? that'd be great!

sandrotosi avatar Dec 22 '15 01:12 sandrotosi

Jody Bruchon: Thanks.

On Tue, Dec 22, 2015 at 1:08 AM Jody Bruchon [email protected] wrote:

@adrianlopezroche https://github.com/adrianlopezroche I renamed my fork in the code and documentation to set it apart as a completely different program. If someone decides to add it to a distribution as a package, it will live on its own and not interfere with any installed fdupes. I hope this addresses your previous concern sufficiently. See jbruchon/fdupes-jody@13f6c51 https://github.com/jbruchon/fdupes-jody/commit/13f6c51917256ca30a491e81231d899784096720 for more info.

— Reply to this email directly or view it on GitHub https://github.com/adrianlopezroche/fdupes/pull/52#issuecomment-166507218 .

adrianlopezroche avatar Dec 22 '15 17:12 adrianlopezroche

Sandro Tosi: I haven't abandoned fdupes. I am much more careful about adding new features (don't want fdupes to grow too complex), but there's still some features on which I intend to continue working, and am open to bugfixes.

On Mon, Dec 21, 2015 at 9:18 PM Sandro Tosi [email protected] wrote:

with my debian maintainer hat on, I can say that 'fdupes' in debian is tracking https://github.com/adrianlopezroche/fdupes/ and it will keep doing it for the foreseeable future. but I have to admit the development here is a bit on hold, and there are a lot of forks out there which might have interesting features to add to fdupes. @adrianlopezroche https://github.com/adrianlopezroche do you plan to resume fdupes development? that'd be great!

— Reply to this email directly or view it on GitHub https://github.com/adrianlopezroche/fdupes/pull/52#issuecomment-166472004 .

adrianlopezroche avatar Dec 22 '15 17:12 adrianlopezroche