Chronicle-Bytes

Compress/Decompress a Bytes<ByteBuffer>

Open: JohannesLichtenberger opened this issue 3 years ago • 1 comment

How do I compress/decompress a Bytes<ByteBuffer> instance? For instance, I'd like to use Snappy :-)
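
For illustration, here is a minimal sketch of a compress/decompress round trip on plain `java.nio.ByteBuffer`s. It is not the Chronicle-Bytes API and it does not use Snappy: since Snappy is not part of the JDK, the sketch swaps in `java.util.zip.Deflater`/`Inflater` (the `ByteBuffer` overloads need Java 11+). The class name `CompressionSketch` and its methods are hypothetical. As far as I know, a `Bytes<ByteBuffer>` exposes its backing buffer via `underlyingObject()`, and snappy-java offers `Snappy.compress(ByteBuffer, ByteBuffer)` and `Snappy.uncompress(ByteBuffer, ByteBuffer)` for direct buffers, so the same shape should carry over, but treat that as an assumption to verify. Note also that the read/write positions of a `Bytes` are tracked separately from the `ByteBuffer`'s position and limit, so those would have to be set explicitly.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: Deflater/Inflater stand in for Snappy here.
final class CompressionSketch {

    // Compresses the readable bytes of src (position..limit) into a new direct buffer.
    // src itself is left untouched because we read through a duplicate.
    static ByteBuffer compress(ByteBuffer src) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(src.duplicate());
            deflater.finish();
            ByteBuffer dst = ByteBuffer.allocateDirect(Math.max(64, src.remaining()));
            while (!deflater.finished()) {
                if (!dst.hasRemaining()) {
                    dst = grow(dst);
                }
                deflater.deflate(dst);
            }
            dst.flip();
            return dst;
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    // Decompresses into a buffer of the given size; the caller is assumed to have
    // stored the uncompressed length alongside the compressed data.
    static ByteBuffer decompress(ByteBuffer compressed, int uncompressedLength)
            throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed.duplicate());
            ByteBuffer dst = ByteBuffer.allocateDirect(uncompressedLength);
            while (dst.hasRemaining() && !inflater.finished()) {
                int n = inflater.inflate(dst);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("compressed input is truncated or unsupported");
                }
            }
            dst.flip();
            return dst;
        } finally {
            inflater.end();
        }
    }

    // Copies the contents of a full buffer into one with twice the capacity.
    private static ByteBuffer grow(ByteBuffer full) {
        ByteBuffer bigger = ByteBuffer.allocateDirect(full.capacity() * 2);
        full.flip();
        bigger.put(full);
        return bigger;
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] raw = "page fragment ".repeat(100).getBytes(StandardCharsets.UTF_8);
        ByteBuffer src = ByteBuffer.allocateDirect(raw.length);
        src.put(raw).flip();
        ByteBuffer packed = compress(src);
        ByteBuffer unpacked = decompress(packed, raw.length);
        System.out.println(raw.length + " bytes compressed to " + packed.remaining()
                + " bytes, round trip equal: " + unpacked.equals(src));
    }
}
```

Working on buffers directly avoids the intermediate `byte[]` copies that stream-based compression tends to introduce; whether that matters for a given workload needs measuring.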

JohannesLichtenberger avatar Jul 08 '22 20:07 JohannesLichtenberger

The tricky part might be that I want to append the data afterwards to an append-only file, prefixed with the length of the data to store. So first I'm serializing all kinds of data records, then I want to compress the database page fragment (and maybe encrypt it in the future), and afterwards I'm syncing the bytes to the file.
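
The length-prefixed append described here could look roughly like the following sketch. It uses only JDK classes (`FileChannel`, `ByteBuffer`); the class `LengthPrefixedAppendSketch`, its method names, and the 4-byte `int` prefix are assumptions for illustration, not SirixDB's actual on-disk format. The payload would be the already serialized (and compressed) page fragment.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: records are stored as [4-byte length][payload].
final class LengthPrefixedAppendSketch {

    // Appends one record at the current end of the file and returns its start offset.
    // Assumes a single writer, so channel.size() is a stable append position.
    static long append(FileChannel channel, ByteBuffer payload) throws IOException {
        long offset = channel.size();
        ByteBuffer header = ByteBuffer.allocate(Integer.BYTES);
        header.putInt(payload.remaining()).flip();
        long pos = offset;
        // Write through a duplicate so the caller's buffer position is not changed.
        for (ByteBuffer part : new ByteBuffer[] {header, payload.duplicate()}) {
            while (part.hasRemaining()) {
                pos += channel.write(part, pos);
            }
        }
        return offset;
    }

    // Reads the record that starts at the given offset.
    static ByteBuffer read(FileChannel channel, long offset) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(Integer.BYTES);
        readFully(channel, header, offset);
        header.flip();
        ByteBuffer payload = ByteBuffer.allocate(header.getInt());
        readFully(channel, payload, offset + Integer.BYTES);
        payload.flip();
        return payload;
    }

    private static void readFully(FileChannel channel, ByteBuffer dst, long offset)
            throws IOException {
        long pos = offset;
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos);
            if (n < 0) {
                throw new EOFException("record truncated at offset " + pos);
            }
            pos += n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("append-sketch", ".bin");
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            long offset = append(channel, ByteBuffer.wrap(new byte[] {1, 2, 3}));
            channel.force(false); // sync the data to disk, without metadata
            System.out.println("record at offset " + offset + " has "
                    + read(channel, offset).remaining() + " payload bytes");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Writing the header and payload from buffers that already exist avoids building one combined array per record. An encryption step, if added later, would sit between compression and `append`.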

Currently the old implementation using ByteArrayOutputStreams is horribly slow: https://github.com/sirixdb/sirix/blob/9427e25d1b20781f203d17512bce83ee9f6a4381/bundles/sirix-core/src/main/java/org/sirix/io/filechannel/FileChannelWriter.java#L128

This method in particular is slow as well: https://github.com/sirixdb/sirix/blob/9427e25d1b20781f203d17512bce83ee9f6a4381/bundles/sirix-core/src/main/java/org/sirix/page/UnorderedKeyValuePage.java#L489

It needs at least as much time to serialize the records to the byte array as the actual write to the file channel takes.

Locally I've already changed all DataInput and DataOutput interfaces/implementations to use Bytes<ByteBuffer> instead. I hope this gives better performance, as the architecture itself (a single writer without any locks) seems right. However, MongoDB on my notebook needs only around 3 to 3.5 minutes to import a 3.8 GB JSON file, while SirixDB needs 9 to 10 minutes. Plus, I'm using a simple test case in Sirix without any client/server communication, so as long as I'm not messing up the implementation it should even be faster IMHO. It also uses a simple trie, with ideas from hash array mapped tries and ART, to find records by dense, ascending numbers (instead of UUIDs, for instance), and it's only traversing a few arrays more or less, which are referenced through in-memory references once loaded from the flash drive.

JohannesLichtenberger avatar Jul 08 '22 22:07 JohannesLichtenberger

Thank you for your issue - however, this question is consultancy rather than an issue with chronicle-bytes. If you wish us to provide you with consultancy, please contact [email protected] and we would be more than happy to assist.

RobAustin avatar Sep 12 '22 10:09 RobAustin