bloscpack

compress-from-stdin (or pipe) and compress-to-stdout?

Open gojomo opened this issue 10 years ago • 2 comments

Would be nice if blpk could compress from stdin or decompress to stdout, so it could be a drop-in alternative to gzip/bzip2/etc.

I tried the workaround of supplying a process-substitution pipe, as in:

  blpk c <(zcat bigfile.gz) bigfile.blp

...but that simply created a 140-byte file that decompressed to a 0-length file. (So it'd be nice if that worked, too.)

Or does blosc[pack] require random access to entire files to operate?

gojomo, Jun 18 '15 07:06

No, no random access is needed and it should theoretically work fine in streaming mode.

You can try to implement the following to support compressing/decompressing from/to stdin/stdout:

PlainSTDINSource
CompressedSTDINSource
PlainSTDOUTSink
CompressedSTDOUTSink

You can use the following files as inspiration:

https://github.com/Blosc/bloscpack/blob/master/bloscpack/abstract_io.py
https://github.com/Blosc/bloscpack/blob/master/bloscpack/file_io.py

The idea is to use a PlainSTDINSource and a CompressedSTDOUTSink in combination with pack:

https://github.com/Blosc/bloscpack/blob/master/bloscpack/abstract_io.py#L109

You might be able to reuse:

https://github.com/Blosc/bloscpack/blob/master/bloscpack/abstract_io.py#L109

and

https://github.com/Blosc/bloscpack/blob/master/bloscpack/file_io.py#L374

This should work because stdin and stdout behave like file pointers and support read and write methods.
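As a rough illustration of the source/sink idea, here is a minimal sketch. It does not use bloscpack's actual base classes or the blosc codec: the names PlainStreamSource, CompressedStreamSink and pack_stream are hypothetical, and zlib from the standard library stands in for blosc so the example is self-contained. The only thing it relies on is that the input has a read method and the output has a write method.

```python
import io
import zlib

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size, not bloscpack's default


class PlainStreamSource:
    """Hypothetical source: yields chunks from any object with read()."""

    def __init__(self, fp, chunk_size=CHUNK_SIZE):
        self.fp = fp
        self.chunk_size = chunk_size

    def __iter__(self):
        while True:
            chunk = self.fp.read(self.chunk_size)
            if not chunk:  # empty bytes signals end of stream
                break
            yield chunk


class CompressedStreamSink:
    """Hypothetical sink: compresses chunks and writes them to any object
    with write(). zlib is a stand-in for blosc here."""

    def __init__(self, fp):
        self.fp = fp
        self._compressor = zlib.compressobj()

    def put(self, chunk):
        self.fp.write(self._compressor.compress(chunk))

    def close(self):
        # Flush whatever the compressor is still buffering.
        self.fp.write(self._compressor.flush())


def pack_stream(source, sink):
    """Move every chunk from source to sink without seeking or sizing."""
    for chunk in source:
        sink.put(chunk)
    sink.close()


# Round trip through in-memory streams, which expose the same
# read/write interface as sys.stdin.buffer and sys.stdout.buffer.
data = b"abc" * 100000
out = io.BytesIO()
pack_stream(PlainStreamSource(io.BytesIO(data), chunk_size=4096),
            CompressedStreamSink(out))
assert zlib.decompress(out.getvalue()) == data
```

For a real pipe, the same call would take sys.stdin.buffer and sys.stdout.buffer in place of the BytesIO objects. A real implementation would also have to decide how to write the bloscpack header without knowing the number of chunks up front, which this sketch ignores.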

Have fun hacking!

esc, Jun 18 '15 07:06

Thanks for the quick reply & detailed pointers! When I get a chance to test the potential benefits of blosc on my data I may poke around at this... probably starting with why the process-substitution pipe doesn't work, as it'd often be an adequate workaround.
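One possible place to start looking, offered as an assumption and not something confirmed in this thread: if blpk plans its chunks from the input's reported file size, a pipe would mislead it, because the size reported for a pipe says nothing reliable about how much data will eventually arrive. The sketch below uses only the standard library to show that a pipe is a FIFO whose stat size differs in meaning from the number of bytes actually readable.

```python
import os
import stat

# Create an anonymous pipe and push some data through it.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"x" * 1000)
os.close(write_fd)  # closing the write end lets the reader see EOF

info = os.fstat(read_fd)
is_fifo = stat.S_ISFIFO(info.st_mode)
# For a pipe this is not the length of the stream; on Linux it is
# commonly reported as 0 (platform-dependent, so not asserted here).
reported_size = info.st_size

with os.fdopen(read_fd, "rb") as fp:
    actual = len(fp.read())  # read until EOF to learn the real length

print(is_fifo, reported_size, actual)
```

If that assumption holds, a fix would read until EOF as above and not trust the size up front.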

gojomo, Jun 18 '15 21:06