miniz_oxide Rustification

In miniz, deflation usually consists of one tdefl_init call followed by multiple tdefl_compress calls. Both calls carry pointers/references to data that lives for different lengths of time, but in C it is all stored in one struct. So I divided the struct in two: CallbackOxide covers a single tdefl_compress call and is responsible for the input and output buffers, and the rest lives in CompressorOxide. But CallbackOut is an enum: it can be a function or a buffer, and the function is supposed to live as long as CompressorOxide and is given at tdefl_init (or not). So I need to extract this Option of a callback function and place it into CallbackOut, which is unwieldy.
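A rough sketch of that split, to make the lifetime issue concrete. Only the names CompressorOxide, CallbackOxide and CallbackOut come from the post; the fields, the PutBufFn signature and the constructor are assumptions made for illustration, not the actual miniz_oxide definitions.

```rust
/// Assumed callback shape: receives an output chunk, returns false to abort.
type PutBufFn = fn(&[u8]) -> bool;

/// Long-lived state, created once (the tdefl_init side).
struct CompressorOxide {
    /// Optional output callback handed over at init time.
    put_buf: Option<PutBufFn>,
    // Dictionary, Huffman tables and so on would live here.
}

/// Where a single tdefl_compress call sends its output.
enum CallbackOut<'a> {
    Func(PutBufFn),
    Buf(&'a mut [u8]),
}

/// Short-lived state that only exists for one tdefl_compress call.
struct CallbackOxide<'a> {
    in_buf: &'a [u8],
    out: CallbackOut<'a>,
}

impl<'a> CallbackOxide<'a> {
    /// Pull the Option out of the compressor and decide the output target:
    /// the stored function if there is one, otherwise the caller's buffer.
    fn new(comp: &CompressorOxide, in_buf: &'a [u8], out_buf: &'a mut [u8]) -> Self {
        let out = match comp.put_buf {
            Some(f) => CallbackOut::Func(f),
            None => CallbackOut::Buf(out_buf),
        };
        CallbackOxide { in_buf, out }
    }
}
```

The extraction step in `new` is the unwieldy part: the function pointer belongs to the long-lived struct but has to be copied into the per-call one every time.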

I mostly preserved all the functions, but divided the huge "global struct" tdefl_compressor from C into sub-structs by meaning, so functions now carry a bundle of these structs. It is better, but now I must pass a list of them around.

There is a mess of error codes, especially in high-level functions: like so.

Cursor's write takes a slice, so sometimes it produces something like this:

while rle.rle_repeat_count != 0 {
    rle.rle_repeat_count -= 1;
    packed_code_sizes.write(&[rle.prev_code_size][..])?;
}

There is an option to make a Vec of size rle.rle_repeat_count, but it would be on the heap and probably slow.
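One possible middle ground, sketched below, is to write the repeated byte in chunks from a small array on the stack. The helper name write_repeated, the usize count and the chunk size of 16 are all assumptions for illustration; nothing here is taken from the miniz_oxide source, and whether it is actually faster would need measuring.

```rust
use std::io::{self, Write};

/// Write `count` copies of `byte` using a small stack buffer
/// instead of a heap-allocated Vec. Hypothetical helper.
fn write_repeated<W: Write>(out: &mut W, byte: u8, mut count: usize) -> io::Result<()> {
    // 16 is an arbitrary chunk size picked for the sketch.
    let chunk = [byte; 16];
    while count > 0 {
        let n = count.min(chunk.len());
        // write_all returns an error if the writer cannot take the whole chunk,
        // rather than silently accepting a short write.
        out.write_all(&chunk[..n])?;
        count -= n;
    }
    Ok(())
}
```

With this, the loop above would collapse to a single call along the lines of `write_repeated(&mut packed_code_sizes, rle.prev_code_size, count)?`, assuming the repeat count is converted to usize first.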

The hottest function seems to be this one.

Aug 14 '17 18:08 Frommi

From what I can see, the callback functionality is not used by flate2 or any of the tests or examples. It's only used by miniz_zip. So, unless the plan is to be able to expose the full functionality of miniz through a C API, we could probably avoid the callbacks entirely. (Alternatively, tdefl_init could return an error if it's initialized with one or both of the function pointers set to something other than 0, though it's only used by one of the examples besides the zip functionality anyhow.)

EDIT: Ah, it's used by tdefl_compress_mem_to_output, which is used internally by tdefl_compress_mem_to_heap/mem as well. So it's needed if we want to expose that externally as a C function (none of this is used by flate2). For Rust and internal use it could probably be replaced by a writer or something. tdefl_compress_mem_to_heap/mem use the callback for tdefl_output_buffer, which is basically a Vec: https://github.com/Frommi/miniz_oxide/blob/master/miniz/miniz_tdef.c#L1367
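To illustrate the "replace it with a writer" idea: if the internal output step only asks for something implementing std::io::Write, then a Vec works directly and a callback can be wrapped in a thin adapter. CallbackWriter and emit are hypothetical names, and the "return false on failure" callback shape is an assumption modelled on miniz rather than a confirmed signature.

```rust
use std::io::{self, Write};

/// Adapts a "put buffer" style callback to std::io::Write.
/// Assumed convention: the callback returns false to signal failure.
struct CallbackWriter<F: FnMut(&[u8]) -> bool>(F);

impl<F: FnMut(&[u8]) -> bool> Write for CallbackWriter<F> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if (self.0)(buf) {
            // The callback takes the whole chunk or nothing.
            Ok(buf.len())
        } else {
            Err(io::Error::new(io::ErrorKind::Other, "output callback failed"))
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Stand-in for the compressor's output step: it only needs some Write.
fn emit<W: Write>(out: &mut W, compressed: &[u8]) -> io::Result<()> {
    out.write_all(compressed)
}
```

The mem_to_heap case then becomes `emit(&mut some_vec, data)`, and only the C-facing entry points would need to construct a CallbackWriter.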

Aug 14 '17 21:08 oyvindln

That got me thinking that we really don't need a rusty API, because flate2 will be better at it anyway. So at least the weird manipulation of callbacks isn't exposed outwards. And all the unsafe code can be purged from the new-methods and placed in the extern C functions.

Callback functionality isn't that hard to implement, and it is important enough for the C API that it is worth doing.

There is currently a bug: I don't update in_buf_size before calling the callback function (in the C API this pointer is used both to pass in the length of the buffer on the call and to return the length of the consumed data). The callback could theoretically use that information, but none of the tests cover it.
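A minimal sketch of that in/out convention and of the ordering the fix needs. Everything here is made up for illustration: demo_compress, the PutBufFn signature, the "consume at most 4 bytes" stand-in for real compression, and the status numbers (negative for errors, following the general miniz style) are assumptions, not the real API.

```rust
use std::os::raw::c_void;

/// Assumed C-style output callback: data pointer, length, user pointer.
type PutBufFn = unsafe extern "C" fn(buf: *const u8, len: usize, user: *mut c_void) -> bool;

/// `in_buf_size` is in/out: bytes available on entry, bytes consumed on return.
unsafe extern "C" fn demo_compress(
    in_buf: *const u8,
    in_buf_size: *mut usize,
    put_buf: PutBufFn,
    user: *mut c_void,
) -> i32 {
    if in_buf.is_null() || in_buf_size.is_null() {
        return -2; // bad parameter (illustrative status value)
    }
    // Read the "in" meaning of the pointer: how many bytes the caller offers.
    let input = unsafe { std::slice::from_raw_parts(in_buf, *in_buf_size) };
    // Stand-in for real work: pretend at most 4 bytes are consumed per call.
    let consumed = input.len().min(4);
    // Publish the "out" meaning *before* the callback runs, because the
    // callback may look at it through its user pointer.
    unsafe { *in_buf_size = consumed };
    let ok = unsafe { put_buf(input.as_ptr(), consumed, user) };
    if ok { 0 } else { -1 }
}
```

A test for the bug could pass a callback whose user pointer aliases in_buf_size and check that it already sees the consumed count.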

Also, how is that?

Aug 15 '17 12:08 Frommi

That got me thinking that we really don't need a rusty API, because flate2 will be better at it anyway.

Yeah, we don't need the high-level writer stuff, as that is provided by flate2, though I think it would be useful to have something that flate2 could use directly, so it can avoid going through an unsafe C layer. (Like versions of the functions flate2 uses that take slices and references, using Rust primitive types where feasible, etc.)

Also, how is that?

Seems okay to me.

Aug 15 '17 14:08 oyvindln