lz4-rs
Compress on 32bit BE, decompress on x86_64 LE
Hi!
I'm using lz4-rs in my compressed_log crate to compress logs on the fly and send them to a remote server over a persistent websocket connection. I noticed that on MIPS 32-bit big-endian clients it produces a data stream that the x86_64 LE server can't decode. This is surprising, because after investigating the upstream liblz4 code it seems that they really care about endianness and the binary stream is supposed to be portable. The decoder reports:
Custom { kind: Other, error: LZ4Error("ERROR_frameType_unknown") }
(This appears to come from these lines in liblz4: https://github.com/lz4/lz4/blob/591b6621244e77d8293230085ae65db0cfa98d88/lib/lz4frame.c#L1078-L1079)
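For context, the frame decoder returns this error when the four bytes it reads at the start of a frame are not a magic number it recognizes. The LZ4 frame format stores the magic number in little-endian order regardless of the host. Below is a minimal sketch, using only the standard library, that classifies the first four bytes of a buffer. The names `FrameKind` and `classify_magic` are made up for illustration and are not part of lz4-rs or liblz4.

```rust
use std::convert::TryInto;

/// Kinds of frame that a leading 4-byte magic number can announce.
/// (Illustrative type, not part of lz4-rs.)
#[derive(Debug, PartialEq)]
enum FrameKind {
    /// Regular LZ4 frame, magic 0x184D2204.
    Lz4Frame,
    /// Skippable frame, magic 0x184D2A50..=0x184D2A5F.
    Skippable,
    /// Legacy frame, magic 0x184C2102.
    Legacy,
    /// Anything else.
    Unknown,
}

/// Classify the first four bytes of `data` as an LZ4 magic number.
/// Returns None if fewer than four bytes are available.
fn classify_magic(data: &[u8]) -> Option<FrameKind> {
    let bytes: [u8; 4] = data.get(..4)?.try_into().ok()?;
    // The magic number is always stored little endian on disk / on the wire.
    let magic = u32::from_le_bytes(bytes);
    Some(match magic {
        0x184D_2204 => FrameKind::Lz4Frame,
        0x184D_2A50..=0x184D_2A5F => FrameKind::Skippable,
        0x184C_2102 => FrameKind::Legacy,
        _ => FrameKind::Unknown,
    })
}
```

If a dump started with the bytes `18 4D 22 04` (the magic number byte-swapped), that would point at an endianness bug in the writer. Any other unrecognized value would rather suggest that the decoder was not positioned at the start of a frame.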
I suspect a problem in the bindings, or invalid build flags for big endian. After some experiments I got some data decoded, but at some point it still fails with ERROR_frameType_unknown. So my question is: is something missing on the lz4-rs side, or is this a bug? I wonder what your opinion is on this matter.
I've also checked how Linux distros build liblz4 for big-endian architectures, but they seem to just use the upstream Makefile (the process is very similar to build.rs in lz4-rs).
Please refer to my fork https://github.com/mpapierski/lz4-rs/blob/big-endian/lz4-sys/build.rs#L20-L29 for the lines that appeared to improve the decoding situation a bit.
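One thing worth keeping in mind when experimenting with build.rs under cross compilation: a build script is compiled for and runs on the host, so `cfg!(target_endian = "big")` inside it describes the host, not the MIPS target. Cargo passes the target's configuration to build scripts through environment variables such as `CARGO_CFG_TARGET_ENDIAN`. The sketch below only shows how that value could be read. It does not reproduce what the fork does, and both function names are hypothetical.

```rust
/// Decide whether the compilation *target* is big endian, given the value of
/// the CARGO_CFG_TARGET_ENDIAN variable that Cargo sets for build scripts.
/// (Hypothetical helper, not taken from lz4-sys.)
fn target_is_big_endian(cargo_cfg_target_endian: Option<&str>) -> bool {
    cargo_cfg_target_endian == Some("big")
}

/// Hypothetical wrapper: read the variable from the build script's environment.
/// Outside of a build script the variable is unset and this returns false.
fn target_is_big_endian_from_env() -> bool {
    let value = std::env::var("CARGO_CFG_TARGET_ENDIAN").ok();
    target_is_big_endian(value.as_deref())
}
```

Whether liblz4 needs any endianness-specific flag at all is a separate question; the sketch only makes sure such an experiment keys off the target rather than the host.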
Hi,
can you create a small example that
- compresses static data on both your mips machine and an amd64 machine
- decompresses the data correctly only if it was created on the amd64 machine
This should not happen, and I'd like to look into it. I would need to be able to reproduce it in a QEMU VM, though.
Hi @jheyens
Sorry for the late reply. Here you can find the relevant files (input.txt, data_x86.bin, data_mips.bin, and data_mips_fork.be): https://www.dropbox.com/s/kobwxdo03t8oxfk/data_lz4.tar.gz?dl=0
To reproduce the problem, please do the following:
cargo install cross
(you probably need Docker on your system)
Set up compressed_log and compressed_log_sink
x86:
# terminal 1
git clone https://github.com/mpapierski/compressed_log_sink
cd compressed_log_sink
git checkout endianness
cargo run -- --bind=0.0.0.0:8080 --output=data.bin
# terminal 2
git clone https://github.com/mpapierski/compressed_log
cd compressed_log
git checkout endianness
cargo run --example simple
mips:
# terminal 1
cd compressed_log
# vim examples/simple.rs
# change the IP address from 127.0.0.1 to the address of one of the network interfaces, e.g. 192.168.x.x
cross run --target mips-unknown-linux-gnu --example simple
When running simple, any line entered on stdin is transmitted as a compressed buffer to the sink, and the sink saves the compressed data to the specified file, e.g. data.bin.
After dumping the compressed data gathered from the x86 and mips runs to the files data_x86.bin and data_mips.bin, you can verify the output:
$ lz4cat data_x86.bin
Lorem ipsum...
$ lz4cat data_mips.bin
Error 44 : Unrecognized header : file cannot be decoded
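Since lz4cat rejects the header outright, it may help to look at where the two dumps start to diverge. Here is a minimal sketch, standard library only, that reports the first differing offset between two buffers and prints a hex prefix. The function names are illustrative, and loading data_x86.bin and data_mips.bin (for example with `std::fs::read`) is left to the caller.

```rust
/// Return the offset of the first byte at which two buffers differ, or None
/// if they are identical. If one buffer is a strict prefix of the other, the
/// difference is reported at the length of the shorter buffer.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    let common = a.len().min(b.len());
    match a.iter().zip(b.iter()).position(|(x, y)| x != y) {
        Some(i) => Some(i),
        None if a.len() != b.len() => Some(common),
        None => None,
    }
}

/// Render up to `n` leading bytes as space-separated lowercase hex.
fn hex_prefix(data: &[u8], n: usize) -> String {
    data.iter()
        .take(n)
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}
```

A difference at offset 0 means the MIPS dump does not even begin with the same magic number as the x86 one; a later offset would mean the header matches and the streams diverge further in.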
Any help or ideas are appreciated. It seems that some obscure compile flags might be missing from lz4-sys, but unfortunately I couldn't figure out which.
@mpapierski - given the response on lz4/lz4#703, do you still suspect an endianness issue?