compress-from-stdin (or pipe) and compress-to-stdout?
It would be nice if blpk could compress from stdin or decompress to stdout, so it could be a drop-in alternative to gzip/bzip2/etc.
I tried the workaround of supplying a process-substitution pipe, as in:
blpk c <(zcat bigfile.gz) bigfile.blp
...but that simply created a 140-byte file that decompressed to a 0-length file. (So it'd be nice if that worked, too.)
Or does blosc[pack] require random access to entire files to operate?
No, no random access is needed and it should theoretically work fine in streaming mode.
You can try implementing the following classes to support compressing/decompressing from/to stdin/stdout:
PlainSTDINSource
CompressedSTDINSource
PlainSTDOUTSink
CompressedSTDOUTSink
You can use the following files as inspiration:
https://github.com/Blosc/bloscpack/blob/master/bloscpack/abstract_io.py
https://github.com/Blosc/bloscpack/blob/master/bloscpack/file_io.py
The idea is to use a PlainSTDINSource and a CompressedSTDOUTSink in combination with pack:
https://github.com/Blosc/bloscpack/blob/master/bloscpack/abstract_io.py#L109
You might be able to reuse:
https://github.com/Blosc/bloscpack/blob/master/bloscpack/abstract_io.py#L109
and
https://github.com/Blosc/bloscpack/blob/master/bloscpack/file_io.py#L374
This should work, since stdin and stdout should behave like file pointers and support read and write methods.
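To make the source/sink idea concrete, here is a minimal, self-contained sketch of chunked streaming between two file-like objects. It is not bloscpack's API: the names PlainStreamSource, CompressedStreamSink, pack_stream and unpack_stream are hypothetical, zlib stands in for blosc, and the 4-byte length prefix is an invented framing rather than the bloscpack file format.

```python
import struct
import zlib

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an arbitrary choice for this sketch


class PlainStreamSource:
    """Hypothetical source: yields chunks read from any readable binary file object."""

    def __init__(self, input_fp, chunk_size=CHUNK_SIZE):
        self.input_fp = input_fp
        self.chunk_size = chunk_size

    def __iter__(self):
        while True:
            chunk = self.input_fp.read(self.chunk_size)
            if not chunk:  # empty bytes means end of stream
                break
            yield chunk


class CompressedStreamSink:
    """Hypothetical sink: writes each compressed chunk with a 4-byte length prefix."""

    def __init__(self, output_fp):
        self.output_fp = output_fp

    def put(self, chunk):
        compressed = zlib.compress(chunk)  # stand-in for a blosc compress call
        self.output_fp.write(struct.pack("<I", len(compressed)))
        self.output_fp.write(compressed)


def pack_stream(source, sink):
    """Push every chunk from source into sink and return the chunk count."""
    nchunks = 0
    for chunk in source:
        sink.put(chunk)
        nchunks += 1
    return nchunks


def unpack_stream(input_fp, output_fp):
    """Reverse of pack_stream: read length-prefixed chunks and decompress them."""
    while True:
        header = input_fp.read(4)
        if len(header) < 4:  # no further complete header, so stop
            break
        (length,) = struct.unpack("<I", header)
        output_fp.write(zlib.decompress(input_fp.read(length)))
```

With this shape, compressing from stdin to stdout would be pack_stream(PlainStreamSource(sys.stdin.buffer), CompressedStreamSink(sys.stdout.buffer)) after an import sys. Note that the chunk count is only known once the input is exhausted, so a real implementation would have to cope with a header that cannot be filled in up front.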
Have fun hacking!
Thanks for the quick reply & detailed pointers! When I get a chance to test the potential benefits of blosc on my data I may poke around at this... probably starting with why the process-substitution pipe doesn't work, as it'd often be an adequate workaround.
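One possible starting point for that investigation, offered as a guess rather than a diagnosis: if the tool derives its chunk layout from the input's on-disk size, a pipe would not give it a meaningful size to work with. The sketch below only checks whether a path is a regular file, which is the case where a stat-based size can be trusted; the helper name input_size_is_reliable is hypothetical and nothing here is taken from the bloscpack source.

```python
import os
import stat


def input_size_is_reliable(path):
    """Return True only for regular files, where st_size is the real byte count.

    For FIFOs and process-substitution paths such as /dev/fd/63, the size
    reported by stat is not the amount of data that will eventually arrive.
    """
    mode = os.stat(path).st_mode
    return stat.S_ISREG(mode)
```

A caller could use such a check to fall back to a streaming code path (or to fail with a clear message) instead of silently packing zero bytes.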