bottomless: add xz compression option
Empirical testing shows that gzip achieves a compression ratio of only about 2x, even on very simple, repetitive data patterns. Since compression matters a lot for reducing our egress traffic and improving throughput in general, the xz algorithm is implemented here as well. On the same data set it achieved a ratio of roughly 50x, about 25 times better than gzip, at the cost of higher CPU usage.
Note: with more algorithms implemented, we should also consider adding code that detects which compression method was used when restoring a snapshot. That would allow restoring from a gzip file while writing new snapshots with xz. Currently, setting the compression method via the env var assumes that both restore and backup use the same algorithm.
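One way the detection mentioned above could work is to look at the first bytes of the stored object, since both formats begin with a fixed signature (gzip: `1f 8b`, xz: `fd 37 7a 58 5a 00`). The sketch below is only an illustration of that idea: the `DetectedCompression` enum and the `detect_compression` function are hypothetical names, not part of bottomless, and the actual change may detect the format differently (for example from the object key).

```rust
/// Compression formats a stored snapshot might use.
/// Hypothetical type for this sketch, not the one used in bottomless.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum DetectedCompression {
    Gzip,
    Xz,
    /// No known signature found (could be raw data or another format).
    Unknown,
}

/// First two bytes of every gzip member (RFC 1952).
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];
/// First six bytes of every .xz stream header.
const XZ_MAGIC: [u8; 6] = [0xfd, b'7', b'z', b'X', b'Z', 0x00];

/// Inspects the leading bytes of a snapshot and reports which
/// compression signature, if any, they match.
pub fn detect_compression(header: &[u8]) -> DetectedCompression {
    if header.starts_with(&XZ_MAGIC) {
        DetectedCompression::Xz
    } else if header.starts_with(&GZIP_MAGIC) {
        DetectedCompression::Gzip
    } else {
        DetectedCompression::Unknown
    }
}
```

With something like this, restore could pick the decoder from the data itself, while the env var would only control how new snapshots are written.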
TODO: I still need to go over the code and check that there are no remaining hardcoded assumptions about using gzip for backups.
env var, LIBSQL_BOTTOMLESS_COMPRESSION=xz. But before we go ahead with this, I think I need to add code that detects the previous compression scheme on restore. Without that, it will be impossible to restore from a gzip backup while using xz for all new backups.
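For reference, selecting the algorithm from the env var could look roughly like the sketch below. Only the variable name `LIBSQL_BOTTOMLESS_COMPRESSION` and the value `xz` come from this thread. The `CompressionChoice` enum, the function names, the accepted spellings `gz`/`gzip`, and the fallback to gzip when the variable is unset are all assumptions made for illustration.

```rust
use std::env;

/// Algorithm used for writing new backups.
/// Hypothetical type for this sketch.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum CompressionChoice {
    Gzip,
    Xz,
}

/// Maps the raw env var value to a compression choice.
/// Assumption: an unset or empty value falls back to gzip.
pub fn parse_compression(value: Option<&str>) -> Result<CompressionChoice, String> {
    match value.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
        None | Some("") | Some("gz") | Some("gzip") => Ok(CompressionChoice::Gzip),
        Some("xz") => Ok(CompressionChoice::Xz),
        Some(other) => Err(format!("unsupported compression method: {}", other)),
    }
}

/// Reads LIBSQL_BOTTOMLESS_COMPRESSION from the process environment.
pub fn compression_from_env() -> Result<CompressionChoice, String> {
    let value = env::var("LIBSQL_BOTTOMLESS_COMPRESSION").ok();
    parse_compression(value.as_deref())
}
```

Keeping the parsing separate from the env lookup makes it easy to test, and it leaves the restore path free to ignore this setting once it detects the format on its own.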
I'm getting corrupted .xz files from this crate at the "Best" compression level. Let me try the default level, but something is off: the file compressed with the crate didn't unpack properly with the xz -d shell command, which is suspicious.
(yep, regular compression level works, and looks only ~10% worse than Best)
There's one more place where compression isn't correctly autodetected: loading main db snapshots. I'll add the code.
k, done
Transplanted to the new repo: https://github.com/tursodatabase/libsql/pull/468