
Option to disable compression of chunks

Open robbat2 opened this issue 6 years ago • 10 comments

One of the sets of content I need to back up is already maximally compressed with XZ, and it makes no sense to try further compressing the chunks with LZ4.

The snapshot data should record the compression format (if any) of the chunks, and compression should be entirely optional. This also future-proofs against the next great compression breakthrough and would allow re-compressing existing backups.

robbat2 avatar Sep 03 '17 02:09 robbat2

In duplicacy_chunk.go it looks like the preferences file takes a "compression-level" value, roughly:

- -1: default zlib compression
- 0: no compression
- 9: best zlib compression
- 100: LZ4

But I haven't tried it.

Also, the docs say the default compression level is -1, but it might actually be 100 (LZ4).
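If that reading is right, trying it would look something like the following `.duplicacy/preferences` entry. This is a hypothetical sketch: the `compression-level` key is untested (as noted above), and the surrounding keys and storage URL are only illustrative.

```json
[
    {
        "name": "default",
        "storage": "b2://my-bucket",
        "compression-level": 0
    }
]
```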

niknah avatar Sep 03 '17 23:09 niknah

Prior to version 1.2 you could set the compression level (using the standard zlib numbers 0-9 or -1) when initializing the storage. However, in version 1.2 I decided to switch to LZ4 for compression and BLAKE2 for hashing (instead of SHA-256), mostly for performance. Therefore, a somewhat arbitrary level of 100 is used to indicate the use of both LZ4 and BLAKE2. I naively believed that LZ4 was so much faster that there would be no need for other options, so the compression level option was removed.

Obviously I was wrong, and the compression level option should be added back to the init command. The good news is that it is super easy to introduce new compression algorithms (supporting LZ4, for instance, took just a few lines of code).

Please suggest the compression algorithms that you think should be supported (besides the no compression option).

gilbertchen avatar Sep 05 '17 03:09 gilbertchen

LZ4 is fine for the cases where I do want compression, but I can see that some people might want something like Snappy for bounded compression time.

robbat2 avatar Sep 05 '17 16:09 robbat2

I'd like a high-compression option, like LZMA or XZ. When we back up to these cloud services, they charge us per month to keep things there, which adds up to a lot after a few years. Some cloud storage services also charge heavily for downloading your data.

Thanks

niknah avatar Sep 05 '17 20:09 niknah

+1 for control of compression. I'm using a raspberry pi and an external drive at an offsite location with particularly fast upload to seed my home backups, as pushing them through my home connection from scratch would take about 18 months. A lot of it is already compressed or encoded in one way or another, and I'm backing up to Backblaze B2, which is ultra-cheap. I'd rather have the time/throughput performance than save a few megs here and there with compression.

fenixnet-net avatar Feb 11 '18 08:02 fenixnet-net

To dev: zstd --long seems to strike a better balance of speed and compression ratio; a possible golden mean?

To cloud junkies: a free plan usually covers preserving the essential bits; the rest is a reluctance to sort.

sergeevabc avatar Feb 22 '18 17:02 sergeevabc

zstd support would be great to have.

Ralith avatar Jul 21 '18 17:07 Ralith

@gilbertchen any updates on the plans here?

Ralith avatar Mar 21 '19 23:03 Ralith

Any modern compression algorithm is smart enough to automatically fall back to storing an incompressible stream uncompressed.

The LZ4 library used in this project does exactly that here:

https://github.com/bkaradzic/go-lz4/blob/7224d8d8f27ef618c0a95f1ae69dbb0488abc33a/writer.go#L138

There is no native Go port of zstd, and linking the C library seems like a poor idea.

sedlund avatar Mar 22 '19 03:03 sedlund

It seems there is now a native Go implementation of zstd: https://github.com/klauspost/compress/tree/master/zstd

morris-t avatar May 07 '20 14:05 morris-t