image-png icon indicating copy to clipboard operation
image-png copied to clipboard

Add `Reader.read_row` for decoding into caller-provided buffer.

Open anforowicz opened this issue 1 year ago • 2 comments

Add Reader.read_row for decoding into caller-provided buffer.

Reader.read_row is an alternative to Reader.next_row.

  • Reader.next_row is convenient when the caller doesn't want to manage their own buffer, and doesn't mind having extended borrows on Reader to accommodate Row data.
  • OTOH the new Reader.read_row avoids an extra copy in some scanarios which may lead to better runtime performance - see the benchmark results below.

The PR adds a row-by-row/128x128-4k-idat benchmark (initially based on Reader.next_row which requires an extra copy). Using the new
Reader.read_row results in the following performance gains in the benchmark:

time: [-16.414% -16.196% -15.969%] (p = 0.00 < 0.05) time: [-15.570% -15.218% -14.856%] (p = 0.00 < 0.05) time: [-16.101% -15.864% -15.629%] (p = 0.00 < 0.05)

anforowicz avatar Aug 16 '24 14:08 anforowicz

This PR replaces https://github.com/image-rs/image-png/pull/421. (This PR can be seen as a non-breaking-change alternative of the other, earlier PR.)

anforowicz avatar Aug 16 '24 14:08 anforowicz

There doesn't seem to be any reason not to merge this. @anforowicz could you resolve the conflicts? I'll push the merge button as soon as that's done.

Shnatsel avatar Sep 18 '24 12:09 Shnatsel

@anforowicz could you resolve the conflicts?

Ack. I'll do that later today. My apologies for missing this message earlier.

There doesn't seem to be any reason not to merge this.

Let me share some of the lingering doubts I have regarding this PR:

  • Cons of this PR are small, but non-zero:
    • New public API (and public APIs are "forever"; or at least changing the APIs requires the friction of a new major version release and/or imposing some pain on crate users)
    • A bit of an unclear story wrt keeping 1) keeping next_row / next_interlaced_row and also read_row from this PR, vs 2) deprecating next_row / next_interlaced_row and only exposing read_row in some future major version release
  • Pros of this PR are unclear:
    • Helps to avoid an extra copy to widen the scanline in SkPngRustCodec::expandDecodedInterlacedRow, but interlaced images are relatively rare + SkPngRustCodec has other inefficiencies around processing them.
    • This PR definitely helps if the output of png doesn't need any further transformations before being usable by the crate client. But... I now think that maybe this is not the case for Skia/Chromium, which requires alpha-premultiplied output. Depending on png output:
      • Indexed: can't save the extra copy / palette expansion hop (this hop is done at the Skia/Chromium layer - see https://crbug.com/356882657)
      • Rgb, Gray or GrayAlpha: can't save the extra copy (need to transform into RGBA)
      • Rgba (~54% of images from top 500 websites):
        • Even with this PR there would still be a need for an extra transformation loop for alpha-premul (and sometimes for rgba=>bgra, or for gamma/iccp). (I have my brief notes about post-decoding transforms in a doc here.)
        • Maybe this PR can save a memcpy and then apply alpha-premul and/or rgba=>bgra transformation in-place. But this is tricky, because SkSwizzler doesn't support in-place transformations, so we'd have to switch to skcms_Transform in some cases.

anforowicz avatar Oct 29 '24 16:10 anforowicz

I think it's okay to merge this. Having read_row doesn't hurt, because even if we wanted to remove it, we can always just make a copy into the buffer.

kornelski avatar Dec 22 '24 13:12 kornelski