
Allow `multifilewarcwriter` to write uncompressed `WARC` files by selectively enabling gzip.

Open Lisias opened this issue 1 year ago • 0 comments

That's the thing: we have file systems with transparent compression nowadays (and to think this started with Stacker on MS-DOS!), so it makes sense to use uncompressed WARC files on BTRFS or NTFS with compression enabled. This commit disables the WARCIO gzip support when the filename does not end with .gz, letting users rely on these filesystems for the compression they want without having to decompress the WARC on use.
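To illustrate the rule described above, here is a minimal stdlib-only sketch of "gzip only when the filename ends with .gz". It is not the code from this commit: `should_gzip` and `write_record` are hypothetical names made up for the example, and the real change lives in how `multifilewarcwriter` configures its WARCIO writer.

```python
import gzip
import io


def should_gzip(filename: str) -> bool:
    """Hypothetical helper: compress only when the target name ends in .gz."""
    return filename.endswith(".gz")


def write_record(out, payload: bytes, filename: str) -> None:
    """Write payload to out, gzip-compressed only if the filename asks for it."""
    if should_gzip(filename):
        # One gzip member per record, in the spirit of per-record WARC compression.
        out.write(gzip.compress(payload))
    else:
        # Leave the bytes untouched and let the filesystem compress them.
        out.write(payload)


# Usage: the same payload, written under both naming conventions.
plain = io.BytesIO()
write_record(plain, b"WARC/1.1\r\n", "rec-00000.warc")

packed = io.BytesIO()
write_record(packed, b"WARC/1.1\r\n", "rec-00000.warc.gz")
```

With this kind of check, the choice stays entirely in the configured filename template, so existing setups that name their files `.warc.gz` should keep their current behavior.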

for https://github.com/webrecorder/pywb/issues/915

This code has been in use for 2 months already on a Linux box with btrfs using zstd:15 compression for the WARC files. The write penalty is negligible, reads are perceptibly faster, and the resulting compression is much better.

Lisias, Aug 19 '24 14:08