lzma-rs
Support legacy LZMA format with a 32-bit `unpacked_size` field
Old versions of the LZMA SDK (e.g. from 7-Zip) used a 32-bit field for the unpacked size in the header. It would be great to have support for these too. LZMA SDK 4.05 (from 2004), for example, handles it this way.
Do you have a reference to the code or documentation, and example files to test this? Otherwise it will be hard to add support for this use case. Feel free to send a pull request as well!
Yes, see how it is done here: https://github.com/XVilka/ocaml-lzma_7z/blob/master/lzma.ml#L387
One place it was used is a very old 7-Zip SDK, which was bundled with a modified CramFS version that used LZMA:
- https://github.com/batterystaples/mkcramfs-lzma
- https://github.com/digiampietro/lzma-uncramfs
Note how 32-bit ints are used for outsize:
- https://github.com/digiampietro/lzma-uncramfs/blob/master/lzma-rg/SRC/7zip/Compress/LZMA_C/LzmaDecode.c
- https://github.com/digiampietro/lzma-uncramfs/blob/master/lzma-rg/SRC/7zip/Compress/LZMA_C/decode.c#L87
Would https://github.com/gendx/lzma-rs/pull/17 (or a variant of it) work for this use case?
Yes, ability to form the header manually is good enough, thanks!
The implementation in https://github.com/gendx/lzma-rs/pull/17 only supports a 64-bit or 0-bit field for the unpacked size though. So I assume it wouldn't work for a 32-bit unpacked size out-of-the-box.
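One possible workaround, sketched below, is to widen the legacy header before handing the stream to a decoder that expects the 64-bit field. This assumes the legacy header is 9 bytes (1 properties byte, 4 dictionary-size bytes, 4 little-endian size bytes) and that the 32-bit field always holds a real size rather than an "unknown" marker. The function name `widen_legacy_header` is hypothetical and not part of lzma-rs.

```rust
/// Legacy header: 1 props byte + 4 dict-size bytes + 4 unpacked-size bytes.
const LEGACY_HEADER_LEN: usize = 9;
/// Usual .lzma header: 1 props byte + 4 dict-size bytes + 8 unpacked-size bytes.
const STANDARD_HEADER_LEN: usize = 13;

/// Hypothetical helper: turn a 9-byte legacy header into a 13-byte one.
/// Assumption: the 32-bit size is a real size, so zero-extension is enough.
fn widen_legacy_header(legacy: &[u8; LEGACY_HEADER_LEN]) -> [u8; STANDARD_HEADER_LEN] {
    let mut wide = [0u8; STANDARD_HEADER_LEN];
    // The props byte and the dictionary size sit at the same offsets in both layouts.
    wide[..5].copy_from_slice(&legacy[..5]);
    // Zero-extend the little-endian 32-bit size: the low 4 bytes are copied,
    // the high 4 bytes stay zero.
    wide[5..9].copy_from_slice(&legacy[5..9]);
    wide
}
```

The widened header could then be placed in front of the remaining compressed bytes, for example with `std::io::Cursor` and `std::io::Read::chain`, so the decoder sees an ordinary 13-byte header.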
@gendx it seems someone else needs this feature too, not only me: https://users.rust-lang.org/t/extract-lzma-file/24793/5
Thanks for the pointer. I don't have a lot of time for implementing it at the moment, nor files from the old SDK to test with.
But feel free to send a pull request, I'll be happy to take a look!
If the `unpacked_size` is known beforehand, #74 should cover most of this use case, except for reading the 13-byte header. You would have to read the header manually, then construct the decoder with the params.
Will this do?

```rust
use std::io;

use byteorder::ReadBytesExt; // provides `read_u8`

impl LzmaParams {
    // Other methods omitted for brevity

    /// Read LZMA parameters from a legacy stream header (32-bit unpacked size).
    pub fn read_header<R>(input: &mut R, options: &Options) -> error::Result<LzmaParams>
    where
        R: io::BufRead,
    {
        // `options` is not used by this sketch.
        let _ = options;

        // Properties byte: props = (pb * 5 + lp) * 9 + lc, so it must be < 225.
        let props = input.read_u8().map_err(error::Error::HeaderTooShort)? as u32;
        if props >= 9 * 5 * 5 {
            return Err(error::Error::InvalidLzmaProperties);
        }
        let lc = props % 9;
        let lp = (props / 9) % 5;
        let pb = props / (9 * 5);
        let properties = LzmaProperties { lc, lp, pb };

        // Dictionary size (little-endian u32)
        let mut dict_size = [0u8; 4];
        input
            .read_exact(&mut dict_size)
            .map_err(error::Error::HeaderTooShort)?;
        let dict_size = u32::from_le_bytes(dict_size) as usize;

        // Unpacked size (little-endian u32 in the legacy format)
        let mut unpacked_size = [0u8; 4];
        input
            .read_exact(&mut unpacked_size)
            .map_err(error::Error::HeaderTooShort)?;
        // Assumption: by analogy with the 64-bit header, all ones means "unknown".
        let unpacked_size = match u32::from_le_bytes(unpacked_size) {
            0xFFFF_FFFF => None,
            size => Some(u64::from(size)),
        };

        Ok(Self {
            properties,
            dict_size,
            unpacked_size,
        })
    }
}
```

Tests would look as follows:

```rust
#[test]
fn test_lzma_params_read_header() {
    use std::io::Cursor;

    // Valid 9-byte header, unpacked size unknown (all ones)
    let mut input = Cursor::new(&b"\x01\x01\x00\x00\x00\xFF\xFF\xFF\xFF"[..]);
    let options = Options::default();
    let params = LzmaParams::read_header(&mut input, &options);
    assert!(params.is_ok());
    let params = params.unwrap();
    assert_eq!(params.properties.lc, 1);
    assert_eq!(params.properties.lp, 0);
    assert_eq!(params.properties.pb, 0);
    assert_eq!(params.dict_size, 1);
    assert_eq!(params.unpacked_size, None);

    // Valid header with an unpacked size of 100 (0x64, little-endian)
    let mut input = Cursor::new(&b"\x01\x01\x00\x00\x00\x64\x00\x00\x00"[..]);
    let options = Options::default();
    let params = LzmaParams::read_header(&mut input, &options);
    assert!(params.is_ok());
    let params = params.unwrap();
    assert_eq!(params.properties.lc, 1);
    assert_eq!(params.properties.lp, 0);
    assert_eq!(params.properties.pb, 0);
    assert_eq!(params.dict_size, 1);
    assert_eq!(params.unpacked_size, Some(100));

    // Invalid properties byte (0xFF >= 225)
    let mut input = Cursor::new(&b"\xFF\x00\x00\x00\x00\x00\x00\x00\x00"[..]);
    let options = Options::default();
    let params = LzmaParams::read_header(&mut input, &options);
    assert!(params.is_err());

    // Too few bytes (8 instead of 9)
    let mut input = Cursor::new(&b"\x01\x00\x00\x00\x00\x00\x00\x00"[..]);
    let options = Options::default();
    let params = LzmaParams::read_header(&mut input, &options);
    assert!(params.is_err());
}
```