lzma-rs

Add option to tune how the buffer size is allocated.

Open · gendx opened this issue 4 years ago • 1 comment

As mentioned in https://github.com/gendx/lzma-rs/pull/22#issuecomment-595374615, there is a trade-off between memory usage and speed.

I suggest adding an option to control the original size of the LZ buffer:

  • DictSize would initialize a buffer of the full dictionary's size right away. That's the behavior before #22.
  • InitialSize(value: usize) would instead initialize it to min(value, dict_size). The behavior after #22 is InitialSize(0).
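A minimal sketch of how the two options above could be modeled. The enum name `BufferAllocation`, the method `initial_len`, and the helper `new_lz_buffer` are hypothetical names chosen for illustration and are not part of the lzma-rs API; only the variant names `DictSize` and `InitialSize` come from the proposal.

```rust
/// Hypothetical option type; only the variant names come from the proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BufferAllocation {
    /// Allocate the full dictionary size up front (the behavior before #22).
    DictSize,
    /// Start with at most this many bytes; the buffer grows later on demand.
    InitialSize(usize),
}

impl BufferAllocation {
    /// Number of bytes to allocate when the LZ buffer is created.
    pub fn initial_len(self, dict_size: usize) -> usize {
        match self {
            BufferAllocation::DictSize => dict_size,
            // Never allocate more than the dictionary can hold.
            BufferAllocation::InitialSize(value) => value.min(dict_size),
        }
    }
}

/// Hypothetical helper: build a zero-filled buffer of the chosen initial length.
pub fn new_lz_buffer(option: BufferAllocation, dict_size: usize) -> Vec<u8> {
    vec![0u8; option.initial_len(dict_size)]
}
```

With this shape, `InitialSize(0)` yields an empty buffer (the behavior after #22), and any `InitialSize` value larger than the dictionary is clamped to `dict_size`.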

It remains to be seen whether there would be a performance regression between the code before #22 and using the DictSize option.

gendx, Mar 05 '20 21:03

I think this makes sense. It would be interesting to also see how other libraries handle this. After all, it might be that the files we are parsing are actually malformed and should have reported a smaller dictionary size.

I would be very happy with using DictSize by default if we can override it by setting InitialSize in options.

Also, it would be interesting to know why the allocation is using alloc_zeroed. Perhaps it is possible and safe to use a non-zero-initialized buffer? Our problem was that alloc_zeroed is slow in WebAssembly because of a call to memset.
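One safe way to avoid zero-filling, sketched under assumptions: reserve capacity with `Vec::with_capacity`, which does not initialize the memory, and only read bytes that have already been written. The type `GrowingBuffer` and its methods are hypothetical and do not reflect how lzma-rs structures its buffer. Unlike a real LZ dictionary, this sketch does not wrap around at the dictionary size, and whether it actually avoids the slow `memset` path in WebAssembly would need to be measured.

```rust
/// Hypothetical append-only buffer: capacity is reserved but never zero-filled.
pub struct GrowingBuffer {
    data: Vec<u8>,
}

impl GrowingBuffer {
    /// Reserve room for `cap` bytes without initializing them.
    pub fn with_capacity(cap: usize) -> Self {
        Self { data: Vec::with_capacity(cap) }
    }

    /// Append one decoded byte.
    pub fn push(&mut self, byte: u8) {
        self.data.push(byte);
    }

    /// Number of bytes written so far.
    pub fn len(&self) -> usize {
        self.data.len()
    }

    /// Byte `dist` positions back from the end (`dist == 1` is the most
    /// recent byte). Returns `None` for positions never written, so
    /// uninitialized memory is never observed.
    pub fn get_back(&self, dist: usize) -> Option<u8> {
        if dist == 0 || dist > self.data.len() {
            None
        } else {
            Some(self.data[self.data.len() - dist])
        }
    }
}
```

The trade-off is that a lookup past the written region has to be handled explicitly (here by returning `None`) instead of silently reading a zero byte.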

dragly, Mar 09 '20 10:03