image-png
image-png copied to clipboard
Add `Reader.read_row` for decoding into caller-provided buffer.
Add Reader.read_row for decoding into caller-provided buffer.
Reader.read_row is an alternative to Reader.next_row.
Reader.next_rowis convenient when the caller doesn't want to manage their own buffer, and doesn't mind having extended borrows onReaderto accommodateRowdata.- OTOH the new
Reader.read_rowavoids an extra copy in some scanarios which may lead to better runtime performance - see the benchmark results below.
The PR adds a row-by-row/128x128-4k-idat benchmark (initially based on
Reader.next_row which requires an extra copy). Using the new
Reader.read_row results in the following performance gains in the
benchmark:
time: [-16.414% -16.196% -15.969%] (p = 0.00 < 0.05) time: [-15.570% -15.218% -14.856%] (p = 0.00 < 0.05) time: [-16.101% -15.864% -15.629%] (p = 0.00 < 0.05)
This PR replaces https://github.com/image-rs/image-png/pull/421. (This PR can be seen as a non-breaking-change alternative of the other, earlier PR.)
There doesn't seem to be any reason not to merge this. @anforowicz could you resolve the conflicts? I'll push the merge button as soon as that's done.
@anforowicz could you resolve the conflicts?
Ack. I'll do that later today. My apologies for missing this message earlier.
There doesn't seem to be any reason not to merge this.
Let me share some of the lingering doubts I have regarding this PR:
- Cons of this PR are small, but non-zero:
- New public API (and public APIs are "forever"; or at least changing the APIs requires the friction of a new major version release and/or imposing some pain on crate users)
- A bit of an unclear story wrt keeping 1) keeping
next_row/next_interlaced_rowand alsoread_rowfrom this PR, vs 2) deprecatingnext_row/next_interlaced_rowand only exposingread_rowin some future major version release
- Pros of this PR are unclear:
- Helps to avoid an extra copy to widen the scanline in
SkPngRustCodec::expandDecodedInterlacedRow, but interlaced images are relatively rare +SkPngRustCodechas other inefficiencies around processing them. - This PR definitely helps if the output of
pngdoesn't need any further transformations before being usable by the crate client. But... I now think that maybe this is not the case for Skia/Chromium, which requires alpha-premultiplied output. Depending onpngoutput:Indexed: can't save the extra copy / palette expansion hop (this hop is done at the Skia/Chromium layer - see https://crbug.com/356882657)Rgb,GrayorGrayAlpha: can't save the extra copy (need to transform into RGBA)Rgba(~54% of images from top 500 websites):- Even with this PR there would still be a need for an extra transformation loop for alpha-premul (and sometimes for rgba=>bgra, or for gamma/iccp). (I have my brief notes about post-decoding transforms in a doc here.)
- Maybe this PR can save a
memcpyand then apply alpha-premul and/or rgba=>bgra transformation in-place. But this is tricky, becauseSkSwizzlerdoesn't support in-place transformations, so we'd have to switch toskcms_Transformin some cases.
- Helps to avoid an extra copy to widen the scanline in
I think it's okay to merge this. Having read_row doesn't hurt, because even if we wanted to remove it, we can always just make a copy into the buffer.