image icon indicating copy to clipboard operation
image copied to clipboard

Proposal: Have `GenericImage` only abstract over `ImageBuffer` and `SubImage`

Open fintelia opened this issue 4 months ago • 5 comments

The GenericImage trait has been in a weird place for a while. The way it is currently defined, it isn't really possible to write efficient algorithms using it (#2300), but at the same time if we impose too many layout constraints then it just ends up being redundant with the ImageBuffer struct (#1118, probably others).

But if we made the trait abstract over both ImageBuffer and sub-images of ImageBuffers, then I think the trait could be useful. It would mean all our image processing algorithms would work on both without requiring inefficient pixel-by-pixel accesses to implement them. The traits would look something like:

pub trait GenericImage {
    type Pixel: Pixel;
    fn buffer(&self) -> &[<Self::Pixel as Pixel>::Subpixel];
    fn dimensions(&self) -> (u32, u32);
    fn row_stride(&self) -> usize;

    // Provided methods
    fn get_pixel(&self, ...) -> &Self::Pixel { ... }
    fn sub_image(&self, ...) -> SubImage<Self::Pixel, ...> { ... }
    ...
}

pub trait GenericImageMut: GenericImage {
    fn buffer_mut(&mut self) -> &mut [<Self::Pixel as Pixel>::Subpixel];

    // Provided methods
    ...
}

(This also switches GenericImageView -> GenericImage and GenericImage -> GenericImageMut, which I think is more aligned with naming conventions, but I don't feel strongly about.)

The SubImage struct could then be something like this (or be broken into separate SubImage and SubImageMut types):

pub struct SubImage<P: Pixel, Buffer: Deref<Target = [P::Subpixel]>> {
    inner: Buffer,
    dimensions: (u32, u32),
    row_stride: usize,
    _phantom: std::marker::PhantomData<P>,
}

impl<P: Pixel, Inner: Deref<Target = [P::Subpixel]>> GenericImage for SubImage<P, Inner> { ... }
impl<P: Pixel, Inner: Deref<Target = [P::Subpixel]> + DerefMut> GenericImageMut for SubImage<P, Inner> { ... }

fintelia avatar Aug 24 '25 22:08 fintelia

As a note while I ponder this, <SubImage as ImageView>::buffer would behave kind of weirdly as returning a view containing pixels that the subiumage does not logically contain. In the mutable case this also means we can't internally work around the aliasing-2d-image-mut-view problem by some kind of unsafe type that uses directly *mut [P::Subpixel].

197g avatar Aug 25 '25 21:08 197g

Yeah, the samples in each row between width * CHANNELS and row_stride would not actually be part of the sub-image. That's not ideal, but I didn't see a good way to avoid it while having an ergonomic API. I assume algorithms are going to want to mutably access multiple rows at once (or at least mutably access one row while immutably accessing a different row).

fintelia avatar Aug 25 '25 23:08 fintelia

The problem with that sample layout is its pitch information is not represented in the value and row stride as a solo method is, personally, a bit odd. We no longer have bounds (thankfully) but also no means of calculating the required pitches on the consumer side. Maybe the return type should be View<'_ [P::Subpixel>, P> instead which already has the members for it. Also that should imply we also implement the trait for it. That is basically a 2d slice without any of the aliasing / split optimizations.

197g avatar Aug 26 '25 09:08 197g

I'm not sure I follow what the distinction is. If we exposed GenericImage::view instead, then:

  • img.view().samples() and img.view().image_slice() would return exactly what img.buffer() did.
  • img.view().flat().layout.height_stride would be exactly what img.row_stride() was.
  • We'd have channel_stride=1 and width_stride=P::CHANNELS.

The last point is the important one for this proposal. By forcing densely packed pixels within a row, algorithm implementations can be much more efficient.

fintelia avatar Aug 30 '25 18:08 fintelia

The idea of returning view actually comes from a much older issue that got take up in #2417. I admit, it was a bit of leading question. I was wondering if we could not gain the major performance benefits of this restriction without a major redesign if we simply had a method return an optional view, i.e. the View type is a much more concrete statement about the layout than implicitly saying we implement the trait only for certain types. At least for all the common layouts implemented in image right now this would provide the benefits of having a codepath work on primitive types and numbers—and if such code paths specialize on channel_stride=1 they should get statically optimized when they are being monomorphized.

197g avatar Aug 31 '25 01:08 197g