rs-async-zip
rs-async-zip copied to clipboard
a computed CRC32 value did not match the expected value - when CRC is in data_descriptor
Related to #122 but slightly different
I'm working with v0.0.17
Here's a zipfile that reproes external_modules.zip
The zipfile was generated with https://www.npmjs.com/package/archiver - and seems to unzip fine with other tools.
Using the zipdetails tool, it shows this for the first couple entries.
00000 LOCAL HEADER #1 04034B50
00004 Extract Zip Spec 14 '2.0'
00005 Extract OS 00 'MS-DOS'
00006 General Purpose Flag 0008
[Bits 1-2] 0 'Normal Compression'
[Bit 3] 1 'Streamed'
00008 Compression Method 0008 'Deflated'
0000A Last Mod Time 58DB8572 'Thu Jun 27 16:43:36 2024'
0000E CRC 00000000
00012 Compressed Length 00000000
00016 Uncompressed Length 00000000
0001A Filename Length 001F
0001C Extra Length 0000
0001E Filename 'node_modules/.package-lock.json'
0003D PAYLOAD
008D2 STREAMING DATA HEADER 08074B50
008D6 CRC 528AD14A
008DA Compressed Length 00000895
008DE Uncompressed Length 0000105F
It seems that the data_descriptor includes the CRC.
I did a bit of digging in my repro and found that async-zip is using the 00000000 CRC for comparison rather than the one in the data_descriptor.
https://github.com/Majored/rs-async-zip/blob/main/SPECIFICATION.md#439
Spec seems to indicate that if header.flags.data_descriptor is set, then crc/lengths should be taken from data descriptor rather than local file header.
Like @ldanilek mentioned in #122 - Things parsed in 0.0.9 and now no longer work.
there is a workaround which is to manually bypass crc check.
entry_reader.reader_mut().read_to_end(&mut contents).await?;
instead of
entry_reader
.reader_mut()
.read_to_end_checked(&mut contents)
.await?;
Don't love bypassing the CRC check though for obvious reasons.
It looks like 0.0.9 had code like this
/// Returns true if the computed CRC32 value of all bytes read so far matches the expected value.
pub fn compare_crc(&mut self) -> bool {
let hasher = std::mem::take(&mut self.hasher);
let final_crc = hasher.finalize();
if self.meta.general_purpose_flag.data_descriptor {
self.data_descriptor.expect("Data descriptor was not read").0 == final_crc
} else {
self.entry.crc32() == final_crc
}
}
which on 0.0.17 no longer seems to work.
https://github.com/Majored/rs-async-zip/blob/main/src/base/read/stream.rs#L177
It seems that the CRC is not read out of the data descriptor.
Then, I think you'd want the CRC check to happen in done() - after parsing the data descriptor - rather than in read_to_end_checked, at least for these data descriptor cases.