
Decoding isn't reporting decode errors back to application

Open dave-andersen opened this issue 5 years ago • 6 comments

Apologies if this is known. Decoding doesn't report decode failures. The library internally prints: Corrupt JPEG data: premature end of data segment

But this doesn't appear to be available to the Rust client: there's no indication of it after calling, e.g., read_raw_data or finish_decompress.

Happy to supply a corrupted image example if it's useful.

dave-andersen avatar Oct 27 '20 20:10 dave-andersen

Which version of Rust are you using?

Does cargo test in this repo pass on your machine?

kornelski avatar Oct 28 '20 17:10 kornelski

cargo 1.47.0 (f3c7e066a 2020-08-28)

Cargo tests all pass.

The Cargo.lock for my test program was:

[[package]]
name = "mozjpeg"
version = "0.8.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c1d3a4f737389e9512b662e4f2c412761919994203874e2366afa97927865b27"
dependencies = [
 "arrayvec",
 "libc",
 "mozjpeg-sys",
 "rgb",
]

I've also retried using a fresh clone of the GitHub repo: mozjpeg = { version = "*", path = "../mozjpeg-rust" }, and the results are the same.

I've attached the test image. A quick way to verify that it fails decoding with libturbojpeg is to run djpeg -outfile /dev/null file.jpg

839442a6aa3dcca42f4962e8ec87b999

dave-andersen avatar Oct 28 '20 19:10 dave-andersen

There is a test for error handling:

https://github.com/ImageOptim/mozjpeg-rust/blob/8b93d481fe3db02445ab55592cb94949fcd55170/tests/decode.rs#L20

All errors are handled by the same error handler, so I don't see why one error would work differently than another.

kornelski avatar Oct 28 '20 21:10 kornelski

Oh, I see. djpeg is not failing. It prints a warning, but still generates an image. The library has identical behavior.

kornelski avatar Oct 28 '20 21:10 kornelski

I see what you mean. But as a library, that warning info should be available to the programmer, not just get printed to stderr, no? (Did I miss a part of the API that provides access to that warning?)

(My use case is that I'm trying to validate that a JPEG hasn't gotten corrupted. This is one of the examples where the file is corrupted, but the handling of that corruption varies considerably by platform. As you note, djpeg just prints out a warning. The Python bindings for libjpeg-turbo throw an exception. Python PIL silently accepts it. Chrome will render it with lots of blank grey. The Go standard library JPEG decoder returns an error trying to parse it ('invalid JPEG format: missing 0xff00 sequence').)

I'm completely sympathetic that there are times when presenting a partial decode is better -- a browser, for example -- but there are also times when it's awesome to be able to be strict.

dave-andersen avatar Oct 28 '20 23:10 dave-andersen

libjpeg makes this a warning callback:

https://github.com/mozilla/mozjpeg/blob/d23e3fc58613bc3f0aa395a8c73a2b1e7dae9e25/jerror.h#L267-L269

I haven't exposed a high-level rusty interface for this callback.
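
For illustration, here is a minimal, self-contained Rust sketch of the convention libjpeg uses for that callback: emit_message receives a message level, negative levels are warnings, and the error manager keeps a num_warnings count. The WarningLog type and into_strict_result are hypothetical names made up for this sketch; they are not part of mozjpeg-rust or mozjpeg-sys, and no real decoding happens here.

```rust
/// Hypothetical stand-in for the warning-related part of libjpeg's error
/// manager. This is a model for discussion, not the mozjpeg-rust API.
#[derive(Debug, Default)]
struct WarningLog {
    /// Mirrors the `num_warnings` counter kept by libjpeg's error manager.
    num_warnings: u32,
    /// Text of every warning seen so far (an addition for this sketch).
    messages: Vec<String>,
}

impl WarningLog {
    /// Follows libjpeg's convention: a negative `msg_level` is a warning
    /// about corrupt data, zero or more is an ordinary trace message.
    fn emit_message(&mut self, msg_level: i32, text: &str) {
        if msg_level < 0 {
            self.num_warnings += 1;
            self.messages.push(text.to_owned());
        }
        // Trace messages are ignored in this sketch.
    }

    /// Strict mode: any warning at all is turned into an error for the caller.
    fn into_strict_result(self) -> Result<(), Vec<String>> {
        if self.num_warnings == 0 {
            Ok(())
        } else {
            Err(self.messages)
        }
    }
}
```

A caller that wants strict validation would feed every message through emit_message while decoding and check into_strict_result afterwards, while a lenient caller (a browser, say) would ignore the log and use the partial image.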

kornelski avatar Oct 29 '20 16:10 kornelski

This is an amazing quirk of libjpeg. It detects end of file, and then pretends there's infinitely more faked data in the file that happens to be an EOI marker over and over again.

https://github.com/libjpeg-turbo/libjpeg-turbo/blob/c0412b56d66b287817009afca7c105feb674cd7c/jdatasrc.c#L111-L113

This code has been there since 1994, so this is very much a case of working as expected.
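
A simplified Rust model of that behavior, as a sketch only: the real code is C, works on a fixed buffer, and treats a completely empty file as a fatal error, all of which is left out here. The Source type and its fields are hypothetical names for this example.

```rust
/// Second byte of the JPEG end-of-image marker (0xFF 0xD9).
const JPEG_EOI: u8 = 0xD9;

/// Hypothetical model of a libjpeg-style data source over an in-memory slice.
struct Source<'a> {
    /// Bytes not yet handed to the decoder.
    data: &'a [u8],
    /// Number of "premature end of file" warnings raised so far.
    warnings: u32,
}

impl<'a> Source<'a> {
    fn new(data: &'a [u8]) -> Self {
        Source { data, warnings: 0 }
    }

    /// Returns the next chunk of at most `max` bytes (at least one byte).
    /// Once the input is exhausted, it records a warning and returns a fake
    /// EOI marker instead of failing, and it keeps doing so on every call.
    fn fill_input_buffer(&mut self, max: usize) -> Vec<u8> {
        if self.data.is_empty() {
            self.warnings += 1;
            return vec![0xFF, JPEG_EOI];
        }
        let n = max.max(1).min(self.data.len());
        let (chunk, rest) = self.data.split_at(n);
        self.data = rest;
        chunk.to_vec()
    }
}
```

Because the decoder always receives a valid-looking end-of-image marker, it finishes normally on truncated input, and the only trace of the problem is the warning count.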

kornelski avatar Sep 16 '23 22:09 kornelski