implement `deflate_if` conditional compression

Open kkent030315 opened this issue 1 year ago • 6 comments

This PR introduces conditional compression via deflate_if, as @SOF3 suggested in the previous PR.

flate!(pub static DATA1: [u8] from "assets/random.dat" with zstd if always);
flate!(pub static DATA2: [u8] from "assets/random.dat" with deflate if less_than_original);
flate!(pub static DATA3: [u8] from "assets/random.dat" with deflate if compression_ratio_more_than 10%);

flate!(pub static DATA4: [u8] from "assets/random.dat" if always);
flate!(pub static DATA5: [u8] from "assets/random.dat" if less_than_original);
flate!(pub static DATA6: [u8] from "assets/random.dat" if compression_ratio_more_than 10%);

flate!(pub static DATA7: str from "assets/chinese.txt" with zstd if always);
flate!(pub static DATA8: str from "assets/chinese.txt" with deflate if less_than_original);
flate!(pub static DATA9: str from "assets/chinese.txt" with deflate if compression_ratio_more_than 10%);

flate!(pub static DATA10: str from "assets/chinese.txt" if always);
flate!(pub static DATA11: str from "assets/chinese.txt" if less_than_original);
flate!(pub static DATA12: str from "assets/chinese.txt" if compression_ratio_more_than 10%);
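
As a rough sketch of how the three conditions above might be decided at compile time: the names `Condition` and `should_compress` are hypothetical and not taken from this PR, and the definition of "compression ratio" used here (the percentage of bytes saved relative to the original size) is an assumption.

```rust
/// Hypothetical model of the `if ...` clause accepted by `flate!`.
#[derive(Clone, Copy)]
enum Condition {
    Always,
    LessThanOriginal,
    /// Threshold in percent, e.g. `10` for `10%`.
    CompressionRatioMoreThan(u64),
}

/// Decide whether the compressed form should be embedded.
/// Assumption: "compression ratio" means the percentage of bytes saved.
fn should_compress(cond: Condition, original_len: u64, compressed_len: u64) -> bool {
    match cond {
        Condition::Always => true,
        Condition::LessThanOriginal => compressed_len < original_len,
        Condition::CompressionRatioMoreThan(threshold) => {
            // Nothing is saved if the input is empty or the output did not shrink.
            if original_len == 0 || compressed_len >= original_len {
                return false;
            }
            let saved_percent = (original_len - compressed_len) * 100 / original_len;
            saved_percent > threshold
        }
    }
}
```

Under this assumed definition, a 1000-byte file that compresses to 850 bytes saves 15% and passes a `10%` threshold, while one that compresses to 950 bytes saves only 5% and would be embedded uncompressed.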

Features

  • Implement deflate_if! proc-macro in include-flate-codegen.

The deflate_if! macro is completely isolated from deflate_file! (and deflate_utf8_file!). This is by design: we do not want to (and should not) determine at runtime whether the file was actually compressed. deflate_if! is a proc-macro that evaluates the condition at compile time and expands to a boolean inside Lazy::new. Because this boolean is a constant, the compiler can optimize away the unreachable branch, so the feature adds no extra work to the runtime code itself.
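
The effect can be sketched with the standard library alone. In the sketch below, `SHOULD_DECOMPRESS` stands in for the boolean that deflate_if! expands to, `EMBEDDED` for the included bytes, and `decode` for the real decompression routine. All three names are hypothetical, and `std::sync::LazyLock` (Rust 1.80+) is used here in place of once_cell::Lazy.

```rust
use std::sync::LazyLock;

// Stand-in for the boolean that the proc-macro would emit at compile time.
const SHOULD_DECOMPRESS: bool = false;

// Stand-in for the bytes that would be embedded via include_bytes!.
static EMBEDDED: &[u8] = b"hello";

// Placeholder for a real decompressor; here it only copies its input.
fn decode(input: &[u8]) -> Vec<u8> {
    input.to_vec()
}

static DATA: LazyLock<Vec<u8>> = LazyLock::new(|| {
    if SHOULD_DECOMPRESS {
        // The condition is a constant `false`, so the optimizer is free to
        // drop this branch and, with it, the call into the decompressor.
        decode(EMBEDDED)
    } else {
        // The data was embedded as-is; no decompression is needed.
        EMBEDDED.to_vec()
    }
});
```

Whether the dead branch is actually removed depends on the optimization level; the lazy wrapper itself stays in either case.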

Admittedly, even if deflate_if! evaluates to false and the entire decompression code is removed by the compiler, once_cell::Lazy and its related runtime code will remain, but its cost is not significant for performance. Ideally, we would use plain include_bytes! or include_str! (without Lazy::new) when compression should not be applied. However, making that work would require a large refactor of the core design of this crate. I decided it is not worth doing compared to what this PR already achieves.

Bug Fixes

  • The changes in #25 did not actually take effect in the flate! macro for types other than str.
    • I forgot to pass the macro_rules! parameters through to the proc-macro implementation, but this did not raise any errors because the parameter is entirely optional. I've added a test to ensure this never happens again.

Tests

  • Added tests for deflate_if! in tests/deflate-if.rs.
  • Added tests for selective compression methods (as pointed out in Bug Fixes) in tests/with-compress.rs.
  • Added tests for syntax check in tests/syntax.rs.

Misc

  • Added an example project in examples/flate.rs. It was especially useful to me for inspecting the actual binary with decompilers, but it should also be useful for people looking to use this crate.

kkent030315, Dec 26 '23 21:12

I've tested the if always conditional compression and the with xxx selective compression methods against the compiled binary using decompilers in a Windows MSVC environment. Everything works as expected.

  • [x] with deflate if always should compress with deflate, with only the minimal deflate dependency in the binary.
  • [x] with zstd if always should compress with zstd, with only the minimal zstd dependency in the binary.
  • [x] if less_than 10 with a very small original file should never compress and should never add any runtime code other than once_cell::Lazy.

kkent030315, Dec 26 '23 22:12

Also, I would suggest using squash & merge when merging PRs to avoid adding meaningless and verbose commits.

kkent030315, Dec 26 '23 22:12

Would it be a bit ambiguous to write if less_than 10? It is not immediately clear what we are comparing - raw buffer size, compressed buffer size, compression ratio/percentage, 1 - compression ratio/percentage, or what?

SOF3, Dec 27 '23 03:12

Would it be a bit ambiguous to write if less_than 10? It is not immediately clear what we are comparing - raw buffer size, compressed buffer size, compression ratio/percentage, 1 - compression ratio/percentage, or what?

Yes, that is what I thought as well. if compression_ratio_more_than 10% reads very well. However, appending a %, e.g. if xxx 10%, breaks parsing with syn::LitInt. We could add custom parsing logic there, but then we might lose the benefits that syn::LitInt gives us in a proc-macro, so it is not worth it. Alternatively, one of these:

  • if compression_ratio_more_than 10
  • if compression_ratio_more_than 10 %
  • if compression_ratio_more_than 10 percent

kkent030315, Dec 27 '23 11:12

Well, combining LitInt and Token![%] makes the if compression_ratio_more_than 10% style possible. @SOF3 How does this look to you?

kkent030315, Dec 27 '23 19:12

Originally I wanted to suggest changing more_than to >, but then I realized this is a slippery slope that would invite other features like &&, ||, and grouping, at which point we would be implementing a DSL parser. So I suppose it's good enough for now.

SOF3, Dec 28 '23 02:12