zstd-rs

Further optimizations

Open KillingSpark opened this issue 3 years ago • 0 comments

I need a place to put down some ideas for further optimizing this crate:

Decoder

  1. [ ] We only need to call reserve once for each block of sequences. The number of bytes a list of sequences adds to the decode buffer can be calculated up front, which might save some re-allocations.
  2. [ ] The way the zstd_streaming binary works is not optimal. It should use the drain_to_writer() functions directly instead of reading into an intermediate buffer. That is what these functions are for.
  3. [x] Read https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1/ and https://fgiesen.wordpress.com/2018/02/20/reading-bits-in-far-too-many-ways-part-2/ again carefully and optimize the bitreaders further
  4. [ ] The ReversedBitreader can be made quite a bit faster by making it less general. Simply returning wrong values for requests of more than 56 bits eliminates the need for error handling on calls to get_bits_(triple). Work on this started in #58.
  5. [x] RingBuffer::extend_from_within makes a lot of small memcpy calls. These can be sped up a lot by not caring about precisely what gets copied past the end of the range we want. Copying one or more u128 values (where possible) is much faster.
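
As an illustration of item 1, here is a minimal sketch of reserving once per block. The `Sequence` struct, `decoded_len`, and `execute_sequences` are hypothetical names made up for this example and are not the crate's actual types or functions; a plain `Vec<u8>` stands in for the decode buffer, and the input is assumed to be already validated (offsets of at least 1 that stay within the buffer, literal lengths that fit the literals section).

```rust
/// Hypothetical, simplified sequence (not the crate's actual type).
#[derive(Clone, Copy)]
struct Sequence {
    literal_length: usize,
    match_length: usize,
    offset: usize,
}

/// Number of bytes a block appends to the decode buffer: every sequence
/// contributes its literals plus its match, and any literals left over
/// after the last sequence are appended at the end.
fn decoded_len(sequences: &[Sequence], trailing_literals: usize) -> usize {
    sequences
        .iter()
        .map(|s| s.literal_length + s.match_length)
        .sum::<usize>()
        + trailing_literals
}

/// Executes all sequences of one block with a single up-front `reserve`.
/// Assumes well-formed input; malformed input would panic here.
fn execute_sequences(out: &mut Vec<u8>, literals: &[u8], sequences: &[Sequence]) {
    let literals_in_sequences: usize = sequences.iter().map(|s| s.literal_length).sum();
    let trailing = literals.len() - literals_in_sequences;

    // One reservation for the whole block instead of one per sequence.
    out.reserve(decoded_len(sequences, trailing));

    let mut lit_pos = 0;
    for seq in sequences {
        // Copy the literals that belong to this sequence.
        out.extend_from_slice(&literals[lit_pos..lit_pos + seq.literal_length]);
        lit_pos += seq.literal_length;

        // Copy the match byte by byte, since it may overlap its own output.
        let start = out.len() - seq.offset;
        for i in 0..seq.match_length {
            let byte = out[start + i];
            out.push(byte);
        }
    }
    // Literals that follow the last sequence.
    out.extend_from_slice(&literals[lit_pos..]);
}
```

Whether this pays off in practice depends on how often the per-sequence reserve currently triggers a re-allocation, so it would need to be benchmarked.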

Encoder

Most of the time in the encoder is spent in the match finding algorithm.

  1. [ ] Different matcher algorithms
  2. [ ] Faster hashing for the hashtable-based matcher algorithm
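
One candidate for item 2 is a multiplicative hash over the next four bytes, a common choice in LZ-style compressors. The sketch below is an assumption about what such a matcher could look like and is not the crate's current implementation; `HASH_BITS`, `MIN_MATCH`, `hash4`, and `find_matches` are hypothetical names, and the table keeps only the most recent position per hash slot.

```rust
/// Table size is 2^HASH_BITS entries (assumed value for this sketch).
const HASH_BITS: u32 = 16;
/// Knuth's multiplicative hashing constant.
const PRIME: u32 = 2_654_435_761;
/// Shortest match worth reporting (assumed value for this sketch).
const MIN_MATCH: usize = 4;

/// Hashes the first four bytes of `data` into a table index.
/// One multiply and one shift, with no per-byte loop.
fn hash4(data: &[u8]) -> usize {
    let v = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
    (v.wrapping_mul(PRIME) >> (32 - HASH_BITS)) as usize
}

/// Greedy matcher: returns (position, offset, length) triples.
fn find_matches(data: &[u8]) -> Vec<(usize, usize, usize)> {
    // u32::MAX marks an empty slot.
    let mut table = vec![u32::MAX; 1 << HASH_BITS];
    let mut matches = Vec::new();
    let mut pos = 0;

    while pos + MIN_MATCH <= data.len() {
        let slot = hash4(&data[pos..]);
        let candidate = table[slot];
        table[slot] = pos as u32;

        if candidate != u32::MAX {
            let cand = candidate as usize;
            // Verify the candidate, since different inputs can share a slot.
            let len = data[cand..]
                .iter()
                .zip(&data[pos..])
                .take_while(|(a, b)| a == b)
                .count();
            if len >= MIN_MATCH {
                matches.push((pos, pos - cand, len));
                pos += len;
                continue;
            }
        }
        pos += 1;
    }
    matches
}
```

Because every candidate is verified byte by byte, a weaker but faster hash only costs compression ratio, never correctness, which is what makes this trade-off worth exploring.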

KillingSpark · May 27 '22 15:05