Improve compression support
- [ ] Checksum support (<- Great first issue!)
- [ ] Check if/how much building custom FSE tables improves the compression rate and what the performance implications are
- [ ] Check if/how much rebuilding huffman tables for each block helps/hurts performance and ratio
- [ ] Dictionary support
- [ ] Check other matching algorithms and window sizes to implement more compression levels
- [x] Make the `Matcher` trait public and allow users to provide their own implementations
This is awesome, I've been swamped with schoolwork so haven't really had time to finish up my compression addition.
Is there anywhere that my help would be valued? Once this is merged into master I can look into restructuring everything to be more organized, or:
- I want to add a lot more to the CI, ala https://github.com/jonhoo/rust-ci-conf
- I can add more documentation
- I want to try experimenting with making the bitreader flip chunks using SIMD, I think it would lead to a real performance speedup
One thing that would be nice to have is actually streaming the input. Currently I am being cheap and just reading the source into a Vec and then processing it as if it was passed as a parameter. That's obviously just a crutch and needs to change so large sources can be processed with efficient memory usage.
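Roughly the shape that could take, as a minimal sketch; `compress_block` here is just a stand-in for the real per-block path:

```rust
use std::io::Read;

// Stand-in for the real per-block compression logic.
fn compress_block(chunk: &[u8], output: &mut Vec<u8>) {
    let _ = (chunk, output);
}

const BLOCK_SIZE: usize = 128 * 1024;

// Pull the source in block-sized chunks instead of buffering the whole input.
fn compress_streaming<R: Read>(mut source: R, output: &mut Vec<u8>) -> std::io::Result<()> {
    let mut buf = vec![0u8; BLOCK_SIZE];
    loop {
        let read = source.read(&mut buf)?;
        if read == 0 {
            break;
        }
        // A real implementation would also maintain the sliding window for match
        // finding, but memory use stays bounded by the block/window size.
        compress_block(&buf[..read], output);
    }
    Ok(())
}
```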
I'd hold off with doc on the encoder part for a bit, it's likely a lot will still change and keeping doc up to date can be tricky. And wrong/outdated doc is worse than no doc in my experience.
Trying out SIMD for the bitreader sounds interesting, especially for the case where three numbers are read in one call. The bitreader is one of the hottest codepaths in the decoding, even small gains can have a huge impact. It also has the benefit of not depending on this branch being merged soon.
The really complicated thing I'm hung up on currently is designing a good algorithm for finding the matches to generate sequences. I learned about efficiently searching strings in university, but this is a bit different because the string to search in constantly changes as the data window moves. I think I found something half promising, but it's annoying to implement in safe Rust. I might put a pin in that and defer it to later to get this branch into a mergeable state.
Anyways do what you think sounds most fun to you :)
@zleyyij I merged this and did some reorganizing myself. Feedback on that would be greatly appreciated.
I removed all re-exports for now so the common stuff is now hidden in ruzstd::encoding::xxx. These should probably come back to the top level imports? Anyways the doc page is now much cleaner after hiding all the internal modules :)
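If they do come back, it would probably just be a few re-exports in lib.rs along these lines (item names are placeholders, not necessarily the final public surface):

```rust
// lib.rs sketch: keep the internals organized under encoding::*, but surface
// the common entry points at the crate root again.
pub mod encoding;

pub use encoding::CompressionLevel;
pub use encoding::FrameCompressor;
```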
Awesome, I'll take a look tomorrow :)
@KillingSpark
Here's my scratchpad:
Externally facing feedback
- I cloned the repo on my machine then ran `cargo doc --open`. The homepage feels readable and organized. A few areas of improvement you could consider:
  - If you use `#![doc = include_str!("README.md")]` in your module level documentation for `lib.rs`, then you can include the entire README in the docs page (see the sketch after this list)
  - If you really wanted to improve docs for the compression side, I really like rustdoc's documentation format, and I would try to use that for every major public facing component: https://doc.rust-lang.org/rustdoc/how-to-write-documentation.html#documenting-components
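Both of those could look roughly like this in `lib.rs` (paths and item names are illustrative only, not the crate's actual layout):

```rust
// Pull the README into the crate-level docs; the include_str! path is relative
// to this file, so with lib.rs in src/ it is usually "../README.md".
#![doc = include_str!("../README.md")]

/// One-line summary of what the item does (rustdoc's suggested opening line).
///
/// A longer description can follow, along with sections like `# Examples`,
/// `# Errors`, and `# Panics` as described in the rustdoc guide linked above.
pub struct FrameCompressor;
```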
Internal implementation feedback
compression/mod.rs
- `compression/mod.rs` is an unusual place to put the `Matcher` trait if I'm understanding what it is correctly. Do you think it would be better suited under `match_generator.rs`?
- The `Matcher` trait appears to define some domain specific language (a space), but it doesn't ever really define what a space is. Is it a heap allocation where compressed data is written?
match_generator.rs
- Module level documentation for `match_generator.rs` would be helpful, just a sentence or two explaining that it's generating matches for the literals section of a zstd frame or whatever it is
- Your implementation for `match_generator` is elegant, I like it
frame_compressor.rs
- I actually made this mistake when I was first writing the compressor stuff: I don't like that `FrameCompressor::compress` is basically a massive match chain, and I think it'll get worse and worse as more compression support is added. What do you think about having a `levels` module, and for every compression level, there's a file with a single `fn compress_LEVEL(input: R, output: &mut Vec<u8>)`? Then the match just has a `CompressionLevel::Uncompressed => compress_uncompressed(input, &mut output)` (rough sketch below)
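Very roughly, something like this; the names and signatures are made up just to show the shape, not the actual API:

```rust
use std::io::Read;

pub enum CompressionLevel {
    Uncompressed,
    Fastest,
}

// levels/uncompressed.rs: emit raw blocks only (stand-in body)
fn compress_uncompressed<R: Read>(input: &mut R, output: &mut Vec<u8>) {
    let _ = input.read_to_end(output);
}

// levels/fastest.rs: run the match generator and emit compressed blocks (stand-in body)
fn compress_fastest<R: Read>(input: &mut R, output: &mut Vec<u8>) {
    let _ = (input, output);
}

pub fn compress<R: Read>(level: CompressionLevel, mut input: R, output: &mut Vec<u8>) {
    // compress() stays a thin dispatcher instead of holding each level's logic inline
    match level {
        CompressionLevel::Uncompressed => compress_uncompressed(&mut input, output),
        CompressionLevel::Fastest => compress_fastest(&mut input, output),
    }
}
```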
bit_writer.rs
- `bit_writer` has a typo on line 10, `outpu` should be `output`.
- Some of the stuff in `bit_writer.rs` is very difficult to understand, and could benefit from comments, eg:
```rust
if idx % 8 != 0 {
    let bits_in_first_byte = 8 - (idx % 8);
    assert!(bits_in_first_byte <= num_bits);
    self.output.as_mut()[idx / 8] &= 0xFFu8 >> bits_in_first_byte;
    let new_bits = (bits << (8 - bits_in_first_byte)) as u8;
    self.output.as_mut()[idx / 8] |= new_bits;
    num_bits -= bits_in_first_byte;
    bits >>= bits_in_first_byte;
    idx += bits_in_first_byte;
}
```
The function it's under has no docstring, and there are no comments throughout the body of the function, so it's difficult to grok easily.
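For example, here's my best guess at what that branch is doing, assuming bits are packed LSB-first and `idx` counts the bits already written (I may well be misreading it, which is sort of the point):

```rust
if idx % 8 != 0 {
    // The current byte already holds `idx % 8` bits in its low positions,
    // so this many bit slots are still free in it.
    let bits_in_first_byte = 8 - (idx % 8);
    assert!(bits_in_first_byte <= num_bits);
    // Clear the unused high bits of the partially filled byte...
    self.output.as_mut()[idx / 8] &= 0xFFu8 >> bits_in_first_byte;
    // ...then place the lowest `bits_in_first_byte` new bits into those free slots.
    let new_bits = (bits << (8 - bits_in_first_byte)) as u8;
    self.output.as_mut()[idx / 8] |= new_bits;
    // Those bits are consumed: fewer remain, and the write position advances.
    num_bits -= bits_in_first_byte;
    bits >>= bits_in_first_byte;
    idx += bits_in_first_byte;
}
```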
Overall, great work, this is a very robust job you've done here. If you'd like a hand with any of these, I'm happy to help.
I'm currently working on SIMD optimization of `reverse_bitreader`. It's headache inducing, but I'm still making progress (I love reading about discrete Galois field matrix transforms /s). My current plan is to implement x86_64 (GFNI/AVX512VL) and ARM NEON support, then write a platform-agnostic fallback version. Currently it seems this will require nightly (maybe we lock it behind a feature flag, inline some assembly, or use a stopgap crate until that's stabilized?).
Thanks for the feedback! Lots of good stuff :)
I think the Matcher trait needs some work anyways, so that the MatcherDriver is generic over the Matcher and doesn't itself implement Matcher. This would allow user code to provide their own Matcher implementation if they want to. But with that in mind, I think it is in the right place, even if it is only pub(crate) with a single implementation for now.
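Very roughly the shape I have in mind; the method and field names here are made up, only the `Matcher`/`MatcherDriver` split matters:

```rust
// Hypothetical sketch: the trait user code can implement themselves...
pub trait Matcher {
    /// Find matches in the current data and report them as sequences.
    fn find_matches(&mut self, data: &[u8], report: &mut dyn FnMut(Sequence));
}

pub struct Sequence {
    pub literal_len: usize,
    pub match_len: usize,
    pub offset: usize,
}

// ...and the driver, which owns the window management and is generic over the
// Matcher instead of implementing the trait itself.
pub struct MatcherDriver<M: Matcher> {
    matcher: M,
    // window/buffer bookkeeping would live here
}

impl<M: Matcher> MatcherDriver<M> {
    pub fn new(matcher: M) -> Self {
        MatcherDriver { matcher }
    }
}
```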
The SIMD work sounds very cool! I think swapping the bitreader based on a feature flag is fine, if the feature documents that the crate is only supported on nightly with that feature enabled
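Roughly how I'd imagine the gating; the feature name is just a suggestion, and Cargo.toml would declare it under `[features]` with a doc note that it only builds on nightly:

```rust
// Pick the bitreader implementation at compile time based on a cargo feature.
// (Feature name and module layout are illustrative only.)

#[cfg(feature = "nightly_simd_bitreader")]
mod simd {
    // nightly-only intrinsics / portable SIMD implementation would live here
    pub struct ReverseBitReader;
}

#[cfg(not(feature = "nightly_simd_bitreader"))]
mod scalar {
    // the existing scalar implementation stays the default
    pub struct ReverseBitReader;
}

#[cfg(feature = "nightly_simd_bitreader")]
pub use simd::ReverseBitReader;
#[cfg(not(feature = "nightly_simd_bitreader"))]
pub use scalar::ReverseBitReader;
```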
Just figured I'd update you, currently working on implementing dictionary support. It's far from done yet, but I'll try to be more regular about providing updates.
I found the paper facebook's implementation uses, and am in the process of working through it, hopefully resulting in dictionaries that compete with dictionaries created by the reference implementation.
Is that paper publicly available? I'd be interested in reading that too :)
It is, "Effective Construction of Relative Lempel-Ziv Dictionaries" https://people.eng.unimelb.edu.au/ammoffat/abstracts/lpmw16www.pdf
As noted here: https://github.com/facebook/zstd/blob/dev/lib/dictBuilder/cover.c#L11-L18
It's a dense read, I'm working with a professor on it as a research project.
After that, I want to implement markov chain based dictionary generation, I believe it could generate higher quality dictionaries at the cost of speed.