image-tiff icon indicating copy to clipboard operation
image-tiff copied to clipboard

Extensibility: mid-level API and harmonizing encoder/decoder structs.

Open feefladder opened this issue 1 year ago • 2 comments

This is mainly a conversation starter. I myself was interested in having a Cloud-Optimized-Geotiff reader in Bevy, which led to #245. Now, @spoutn1k is working on harmonizing the API together with EXIF support over at #242. Both PRs, along with other issues:@fintelia said https://github.com/image-rs/image-tiff/issues/232#issuecomment-2109081033 multiple places that the encoding API needs an overhaul. Also in https://github.com/image-rs/image-tiff/issues/222, someone agreed with me that private fields in the Image make inspection difficult. The main point is that I think that overhaul would mean using a more config/property type of design, rather than a generics design https://github.com/image-rs/image-tiff/issues/205#issuecomment-1913396405 https://github.com/image-rs/image-tiff/issues/205.

How I would like to be able to build on top of image/tiff for GeoTiff (and hopefully also OME for others #228)

Basically in #246:

let reader = RangedStreamer::new(length, 30*1024, range_get);
    
    // this decoder will read all necessary tags
    let decoder =  Decoder::new_overview_async(reader, 0).await.expect("oh noes!");
    println!("initialized decoder");
    let cloneable_decoder = tiff::decoder::ChunkDecoder::from_decoder(decoder);

and over at geotiff:

struct GeoImage {
  image: tiff::Image, // Image is currently private-ish
  geo_tags: MyStruct
  // etc
}

What API/internal structure would help here

The What I think this library needs is:

  • Decoder/Encoder harmonization: The API is different, the structs are different, everything is different.
  • Extensibility: This library is made to build things on top of, and sort of provides API at different levels, but imho not as clearly as inspired by "libtiff: provides interfaces to image data at several layers of abstraction (and cost). At the highest level image data can be read [...] without regard for the underlying data organization, colorspace, or compression scheme. Below this high-level interface the library provides scanline-, strip-, and tile-oriented interfaces that return data decompressed but otherwise untransformed. [...] At the lowest level the library provides access to the raw uncompressed strips or tiles, returning the data exactly as it appears in the file."
  • ASync-supporting (reader agnostic) design: Base the decoding on images on memreaders/u8 buffers, <- where the mid-level api resides. For high-level API (current) There is the wrapping Decoder full of convenience functions for easy reading and inspection.

Now, I think it would be a good idea to have three levels of wrapping structs that hold data:

/// Hold all Images, possibly decoded. Fields are public so its structure is part of the api. (reduces maintenance burden, increases risk of breaking changes)
/// This is the "wrapper" type of struct that other implementers would re-implement. Where we provide a basic API over the mid-level API,
/// Importantly, it holds all structural elemtents of a TIFF that don't belong in an encoder/decoder and allows for incrementally reading tiffs. Basically, Encoder/Decoder both write from/to this struct while encoding/decoding an image.
/// Just use u64, since we're not targetting super memory-constrained things
struct TIFF {
   // ifd_offsets: Option<Vec<u64>> // was seen_ifds in Decoder, but I would like a scan_ifds() function
   images: Vec<enum {Image(Image), Offset(u64)}> // indices should match, or enum 
   bigtiff: bool,
   byte_order: ByteOrder,
}

/// 
/// IFD that contains an image as opposed to a sub-ifd (which also may contain an image, but we don't necessarily check there
struct Image<'a> {
  // useful fast-access tags
  // _not_ necessarily bigtiff or endianness
  sub_ifds: Vec<IFD>
}

/// IFD that doesn't necessarily hold all tags, or even image data
struct<'a> IFD {
  sub_ifds: Vec<IFD>, // vec ?is? a pointer, so ?no? infinity otherwise Vec<Box/Arc<IFD>>
  inner: BTreeMap<Tag, Entry>
}

Then, in the mid-level API, (Image level) has static functions, and is a thin, ergonomic wrapper that can be re-implemented if other tag-reading logic is wanted. (I'd almost say, add an EasyDecoder struct that needs less initialization.

impl Decoder {
  /// static method for reading in an IFD, are these on Image? If they are, I think they should be exposed there
  pub fn read_ifd<R: Read + Seek>(reader: R, offset: u64) -> IFD {todo!;}
  pub fn read_ifd_async<R: AsyncRead + AsyncSeek + Unpin> (reader: R, offset: u64) -> IFD {todo!().await}

  
  /// scans over the linked list of ifds without reading them
  pub fn scan_images(&mut self) {
    // same as next_ifd, but doesn't read and goes on completely
    self.ifd_offsets = todo!();
  }
  /// read in all ifds of all images
  pub fn read_image_ifds<R: Read + Seek>(&self) {
    if self.ifd_offsets.is_none() {self.scan_ifds();} //should be unnecessary
    let Some(offsets) = self.ifd_offsets else { unreachable!(); }
    let self.images = offsets.map(|offset| self.read_image_ifd(offset)).collect();
  }
}
  • Document high-level vs mid/low-level api, which - I think - would
  • Publish Image for debugging(#222)/wrapping purposes(#246)
    • For this, it should be clear what Image is. Now, there are multiple use-cases for image:
      • (partially) decoding a tiff <- the Directory<Tag, Entry> holds offsets and/or Values.
      • encoding a tiff or further processing <- the Directory<Tag, DecodedEntry> holds only Values
        • How to handle sub-IFDs in this case? Should we have something like:
          pub struct DecodedIfd<'a> {
            ifd: BTreeMap<Tag, DecodedEntry>;
            sub_ifds: vec<&'a DecodedIfd>
          }
          /// An Image that has retrieved all IFD info when decoding, or for encoding 
          pub struct DecodedImage<'a> {
            sub_ifds: Vec<&'a DecodedIfd> // or a more "basic" sub-ifd type
          }
          
          Then DecodedEntry holds a Value::IFD(sub_ifds.find(the_ifd)) or Value::IFD8(vec_index). Allows for recursion? Or Box something? I would say: "the [Image] struct holds all necessary metadata for decoding an image from an (async) decoder" <- tying Image and Decoder closer together, basically I'm trying to avoid a big maintenance
  • More control over the decoding process: Allow reading an image without directly reading all tags (#85) or read them all (#243 #242 yay tiff magic).
    • This would also allow reading just an IFD, without needing to read in all things

Differences between Encoder/Decoder

Main difference is that the Encoder uses generics for SampleType (RGB8/RFloat64-type images)/TiffKind "bigtiffnes", whereas decoder uses properties/fields. As suggested elsewhere, fields is the preferred way forward, but this begs several questions.

  • How badly should we break the API?
    • new_image::<RBG8>() is very ergonomic, much better than manually setting fields. However, it leads to big manual lists of each and every type with its sampledepth etc due to combinatorial explosion.
    • I would say, keep the API, and create an Image/TIFF from it that holds the information as properties.

e.g.

impl ImageEncoder { // no generics
  fn new_image<T: SampleType>() {
     let image = Image::new()
     /// set the Image's values based on T, 
     image.set_tag(Tag::SamplesPerPixel, T::SAMPLES_PER_PIXEL)
  }
}

It's a bit rough atm, but here's the idea!

feefladder avatar Oct 03 '24 13:10 feefladder