Implement PXB-2669 - ZSTD compression

Open altmannmarcelo opened this issue 2 years ago • 2 comments

https://jira.percona.com/browse/PXB-2669

Compress -

ZSTD compression consumes the read buffer and passes it to ZSTD_compressStream2. We use ZSTD's built-in thread pooling, and multi-thread support is now enabled for the static ZSTD build. Adjusted the files required to compress and decompress via xtrabackup. Modified copy_file to handle empty files, as ZSTD refuses to run if we provide an empty file: it requires the file to have at least a ZSTD-like header with empty content.

Decompress -

Compressing with xtrabackup --compress=zstd and decompressing with xtrabackup --decompress works fine, as we do everything in one go (read the entire file and pass it to the zstd client).

Xbstream is a bit different. There is no alignment between the buffers we deal with in compression and streaming (read buffer, compressed frames, and xbstream chunks).

Let's take an example of hypothetical File 1. We will have 3 layers of data:

  1. Raw file: we read the raw file in read_buffer_size chunks (default 10 MB):
+------------------------------------------------------------+
| File 1                                                     |
+------------------------------------------------------------+
                                ^
                                |
                              10 MB
  2. Compressed data: the read buffer is passed into ds_compress_zstd and gets compressed into one or more frames:
+------------------+----------------+
|  File 1 Frame 1  | File 1 Frame 2 |
+------------------+----------------+

Each frame has a header (H) and one or more blocks (Bn) of variable length:

+------------------+----------------+
|  File 1 Frame 1  | File 1 Frame 2 |
+------------------+----------------+
|H| B1   |B2| B3   |H| B1 |    B2   |
+------------------+----------------+
  3. XBStream data: xbstream then writes the data in multiple chunks of read_buffer_size:
+------------------$---+-------------+
| XBS chunk 1      $   | XBS chunk 2 |
+------------------$---+-------------+
                   $
             End of File1 Frame 1

At xbstream, we receive data as chunks (item 3 above). XBStream chunk 1 holds the complete data of File 1 Frame 1 (F1F1) and part of the data of File 1 Frame 2 (F1F2); the rest of F1F2 only becomes available in XBStream chunk 2. We therefore first need to parse the xbstream data according to the ZSTD compression format (https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md) to determine when F1F1 is complete and can be sent to the ZSTD decompress functions. We also need to save the leftover part of chunk 1 and prepend it to XBS chunk 2 in order to see the full F1F2 data (that is, to reassemble item 2, the compressed data).
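To make the frame-boundary problem concrete, here is a minimal sketch of walking a single frame using only the layout described in zstd_compression_format.md (magic number, frame header, 3-byte block headers, optional 4-byte checksum). It is not the code from this patch, which relies on ZSTD_findFrameCompressedSize; the names FrameStatus, FrameInfo and find_frame are hypothetical, and skippable frames are not handled.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch (not the patch's own enum).
enum class FrameStatus { Complete, Incomplete, Error };

struct FrameInfo {
  FrameStatus status;
  size_t size;        // total frame bytes, meaningful only when Complete
  bool has_checksum;  // Content_Checksum_flag from the frame header
};

// Walks one regular ZSTD frame that starts at data[0].
FrameInfo find_frame(const uint8_t *data, size_t len) {
  const FrameInfo incomplete{FrameStatus::Incomplete, 0, false};
  const FrameInfo error{FrameStatus::Error, 0, false};

  if (len < 5) return incomplete;  // magic (4) + Frame_Header_Descriptor (1)
  const uint32_t magic = uint32_t(data[0]) | (uint32_t(data[1]) << 8) |
                         (uint32_t(data[2]) << 16) | (uint32_t(data[3]) << 24);
  if (magic != 0xFD2FB528u) return error;

  const uint8_t fhd = data[4];
  const unsigned fcs_flag = fhd >> 6;          // Frame_Content_Size_flag
  const bool single_segment = (fhd >> 5) & 1;  // Single_Segment_flag
  const bool has_checksum = (fhd >> 2) & 1;    // Content_Checksum_flag
  const unsigned dict_flag = fhd & 3;          // Dictionary_ID_flag
  if (fhd & 0x08) return error;                // reserved bit must be zero

  static const size_t dict_bytes[4] = {0, 1, 2, 4};
  static const size_t fcs_bytes[4] = {0, 2, 4, 8};
  size_t pos = 5;
  if (!single_segment) pos += 1;  // Window_Descriptor
  pos += dict_bytes[dict_flag];
  pos += (fcs_flag == 0) ? (single_segment ? 1 : 0) : fcs_bytes[fcs_flag];

  for (;;) {
    if (len < pos + 3) return incomplete;  // block header not fully received
    const uint32_t bh = uint32_t(data[pos]) | (uint32_t(data[pos + 1]) << 8) |
                        (uint32_t(data[pos + 2]) << 16);
    const bool last_block = bh & 1;
    const unsigned type = (bh >> 1) & 3;  // 0 raw, 1 RLE, 2 compressed
    const size_t block_size = bh >> 3;
    if (type == 3) return error;  // reserved block type
    pos += 3 + (type == 1 ? 1 : block_size);  // an RLE block stores one byte
    if (last_block) break;
  }
  if (has_checksum) pos += 4;
  if (len < pos) return incomplete;  // frame tail is in a later chunk
  return {FrameStatus::Complete, pos, has_checksum};
}
```

Called on the bytes of XBS chunk 1, a walker like this would report F1F1 as Complete with its size, and the bytes after it (the start of F1F2) as Incomplete until chunk 2 is appended.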

Thus we need a ring buffer to parse each xbstream chunk. The ring buffer works as follows:

  • When we receive a new xbstream chunk, we save it as a new buffer in the ring buffer at ds_istream.h.
  • Before checking whether we have a complete frame, we save the current position of the ring buffer. We then check whether our contiguous buffer holds a complete frame by validating the return value of ZSTD_findFrameCompressedSize. If we have a complete frame, we return ZSTD_OK and the frame size (used to decompress the frame in the main function). On error, we check whether the error is due to a wrong source size, meaning this is a partial buffer, in which case we return ZSTD_INCOMPLETE; in all other cases we return ZSTD_ERROR. We always restore the ring buffer to the saved position, as the main function is responsible for advancing the buffer by the frame length.
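The save, probe, restore pattern described above can be sketched roughly as follows. This is an illustration under loud assumptions, not the ds_istream.h implementation: ChunkBuffer, Probe, probe_frame and drain are hypothetical names, consumed chunks are never freed, and a toy framing (a tag byte 'F', a one-byte payload length, then the payload) stands in for real ZSTD frames so the example has no library dependency.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-ins for the ZSTD_OK / ZSTD_INCOMPLETE / ZSTD_ERROR results.
enum class Probe { Ok, Incomplete, Error };

// Simplified chunk store with a read position that can be saved and restored.
class ChunkBuffer {
 public:
  struct Position {
    size_t chunk = 0;
    size_t offset = 0;
  };

  void add_chunk(std::vector<uint8_t> bytes) { chunks_.push_back(std::move(bytes)); }
  Position save() const { return pos_; }
  void restore(Position p) { pos_ = p; }

  // Reads up to n bytes from the current position, crossing chunk boundaries,
  // and advances the position. Pass out == nullptr to skip without copying.
  size_t read(uint8_t *out, size_t n) {
    size_t done = 0;
    while (done < n && pos_.chunk < chunks_.size()) {
      const std::vector<uint8_t> &c = chunks_[pos_.chunk];
      const size_t take = std::min(n - done, c.size() - pos_.offset);
      if (out != nullptr) std::copy_n(c.data() + pos_.offset, take, out + done);
      done += take;
      pos_.offset += take;
      if (pos_.offset == c.size()) {
        ++pos_.chunk;
        pos_.offset = 0;
      }
    }
    return done;
  }

 private:
  std::vector<std::vector<uint8_t>> chunks_;
  Position pos_;
};

// Checks for a complete toy frame without consuming it: the position is
// always restored, so only the caller advances the buffer.
Probe probe_frame(ChunkBuffer &buf, size_t *frame_size) {
  const ChunkBuffer::Position saved = buf.save();
  uint8_t header[2];
  Probe result;
  if (buf.read(header, 2) < 2) {
    result = Probe::Incomplete;
  } else if (header[0] != 'F') {
    result = Probe::Error;
  } else if (buf.read(nullptr, header[1]) < header[1]) {
    result = Probe::Incomplete;  // payload continues in a later chunk
  } else {
    result = Probe::Ok;
    *frame_size = 2 + static_cast<size_t>(header[1]);
  }
  buf.restore(saved);
  return result;
}

// Main loop: extracts every complete frame; returns false on a bad frame.
bool drain(ChunkBuffer &buf, std::vector<std::vector<uint8_t>> *frames) {
  size_t size = 0;
  for (;;) {
    const Probe p = probe_frame(buf, &size);
    if (p == Probe::Error) return false;
    if (p == Probe::Incomplete) return true;  // wait for the next chunk
    std::vector<uint8_t> frame(size);
    buf.read(frame.data(), size);  // advance by exactly one frame
    frames->push_back(std::move(frame));
  }
}
```

Keeping the probe side-effect free means a partial frame costs nothing: the next call to drain, after another chunk has been added, simply starts from the same position.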

Decompression: We use the ZSTD buffer structs: an in (read) buffer and an out (write) buffer. Each receives a pointer to the real buffer that data is read from or written to, and keeps track of the position processed so far. After each chunk of data is decompressed, we write it out to disk and, if the checksum flag is set in the frame header, update the checksum computed so far. We keep consuming the read buffer until all the data has been decompressed. As a final step before considering the frame complete, we validate the checksum if necessary.
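The consume-until-done loop can be illustrated with a small sketch. Everything here is a stand-in chosen so the example runs without libzstd: InBuffer and OutBuffer mirror the field layout of ZSTD_inBuffer and ZSTD_outBuffer, toy_decode_step copies bytes unchanged in place of a real streaming decompress call, FNV-1a replaces the XXH64-based content checksum that ZSTD frames use, and decode_frame with its std::string sink is a hypothetical substitute for writing to disk.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Same fields as ZSTD_inBuffer / ZSTD_outBuffer: a pointer to the real
// buffer, its size, and the position processed so far.
struct InBuffer {
  const void *src;
  size_t size;
  size_t pos;
};
struct OutBuffer {
  void *dst;
  size_t size;
  size_t pos;
};

// Stand-in for a streaming decompress call: moves as many bytes as fit from
// in to out and advances both positions.
void toy_decode_step(OutBuffer *out, InBuffer *in) {
  const size_t n = std::min(in->size - in->pos, out->size - out->pos);
  std::memcpy(static_cast<char *>(out->dst) + out->pos,
              static_cast<const char *>(in->src) + in->pos, n);
  in->pos += n;
  out->pos += n;
}

// Incremental 32-bit FNV-1a, used here only as a placeholder checksum.
const uint32_t kFnvInit = 2166136261u;
uint32_t fnv1a_update(uint32_t h, const char *p, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    h ^= static_cast<uint8_t>(p[i]);
    h *= 16777619u;
  }
  return h;
}

// Decodes one frame into sink, updating the checksum per output chunk and
// validating it once the whole input has been consumed.
bool decode_frame(const std::string &frame, bool has_checksum,
                  uint32_t expected, std::string *sink) {
  char scratch[4];  // deliberately tiny so the loop runs several times
  InBuffer in{frame.data(), frame.size(), 0};
  uint32_t sum = kFnvInit;
  while (in.pos < in.size) {
    OutBuffer out{scratch, sizeof(scratch), 0};
    toy_decode_step(&out, &in);
    sink->append(scratch, out.pos);  // "write it out to disk"
    if (has_checksum) sum = fnv1a_update(sum, scratch, out.pos);
  }
  return !has_checksum || sum == expected;
}
```

Because the checksum is updated chunk by chunk, the full decompressed file never has to be held in memory to validate it.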

Testing: Extended the necessary tests to also cover ZSTD compression. Adjusted page_compression.sh: in its last step we were testing sparseness on the backup files, while the intention is to test sparseness on the datadir files to ensure --copy-back did not lose it.

altmannmarcelo, Jun 15 '22 19:06

https://pxb.cd.percona.com/job/percona-xtrabackup-8.0-test-param/207/

altmannmarcelo, Jun 15 '22 19:06

@marce, I am seeing a compilation error and some warnings on macOS. Please fix.

th/libprotobuf.3.19.4.dylib /Users/rahulmalik/MySQL/src/x8/bld/runtime_output_directory/xtrabackup
Undefined symbols for architecture x86_64:
  "_ZSTD_createThreadPool", referenced from:
      compress_init(char const*) in ds_compress_zstd.cc.o
  "_ZSTD_freeThreadPool", referenced from:
      compress_deinit(ds_ctxt*) in ds_compress_zstd.cc.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

[12/264] Building CXX object storage/in...es/xbstream.dir/ds_decompress_zstd.cc.o
/Users/rahulmalik/MySQL/src/x8/storage/innobase/xtrabackup/src/ds_decompress_zstd.cc:185:44: warning: variable 'start_frame_cur' may be uninitialized when used here [-Wconditional-uninitialized]
        frame_size = stream.length_from_to(start_frame_cur, start_frame_pos,
                                           ^~~~~~~~~~~~~~~
/Users/rahulmalik/MySQL/src/x8/storage/innobase/xtrabackup/src/ds_decompress_zstd.cc:143:25: note: initialize the variable 'start_frame_cur' to silence this warning
  size_t start_frame_cur, end_frame_cur, start_frame_pos, end_frame_pos = 0;
                        ^
                         = 0
/Users/rahulmalik/MySQL/src/x8/storage/innobase/xtrabackup/src/ds_decompress_zstd.cc:185:61: warning: variable 'start_frame_pos' may be uninitialized when used here [-Wconditional-uninitialized]
        frame_size = stream.length_from_to(start_frame_cur, start_frame_pos,
                                                            ^~~~~~~~~~~~~~~
/Users/rahulmalik/MySQL/src/x8/storage/innobase/xtrabackup/src/ds_decompress_zstd.cc:143:57: note: initialize the variable 'start_frame_pos' to silence this warning
  size_t start_frame_cur, end_frame_cur, start_frame_pos, end_frame_pos = 0;
                                                        ^

rahulmalik87, Jul 14 '22 05:07