image-tiff
image-tiff copied to clipboard
Extensibility: mid-level API and harmonizing encoder/decoder structs.
This is mainly a conversation starter. I myself was interested in having a Cloud-Optimized-Geotiff reader in Bevy, which led to #245. Now, @spoutn1k is working on harmonizing the API together with EXIF support over at #242. Both PRs, along with other issues:@fintelia said https://github.com/image-rs/image-tiff/issues/232#issuecomment-2109081033 multiple places that the encoding API needs an overhaul. Also in https://github.com/image-rs/image-tiff/issues/222, someone agreed with me that private fields in the Image make inspection difficult. The main point is that I think that overhaul would mean using a more config/property type of design, rather than a generics design https://github.com/image-rs/image-tiff/issues/205#issuecomment-1913396405 https://github.com/image-rs/image-tiff/issues/205.
How I would like to be able to build on top of image/tiff for GeoTiff (and hopefully also OME for others #228)
Basically in #246:
let reader = RangedStreamer::new(length, 30*1024, range_get);
// this decoder will read all necessary tags
let decoder = Decoder::new_overview_async(reader, 0).await.expect("oh noes!");
println!("initialized decoder");
let cloneable_decoder = tiff::decoder::ChunkDecoder::from_decoder(decoder);
and over at geotiff:
struct GeoImage {
image: tiff::Image, // Image is currently private-ish
geo_tags: MyStruct
// etc
}
What API/internal structure would help here
The What I think this library needs is:
- Decoder/Encoder harmonization: The API is different, the structs are different, everything is different.
- Extensibility: This library is made to build things on top of, and sort of provides API at different levels, but imho not as clearly as inspired by "libtiff: provides interfaces to image data at several layers of abstraction (and cost). At the highest level image data can be read [...] without regard for the underlying data organization, colorspace, or compression scheme. Below this high-level interface the library provides scanline-, strip-, and tile-oriented interfaces that return data decompressed but otherwise untransformed. [...] At the lowest level the library provides access to the raw uncompressed strips or tiles, returning the data exactly as it appears in the file."
- ASync-supporting (reader agnostic) design: Base the decoding on images on memreaders/u8 buffers, <- where the mid-level api resides. For high-level API (current) There is the wrapping Decoder full of convenience functions for easy reading and inspection.
Now, I think it would be a good idea to have three levels of wrapping structs that hold data:
/// Hold all Images, possibly decoded. Fields are public so its structure is part of the api. (reduces maintenance burden, increases risk of breaking changes)
/// This is the "wrapper" type of struct that other implementers would re-implement. Where we provide a basic API over the mid-level API,
/// Importantly, it holds all structural elemtents of a TIFF that don't belong in an encoder/decoder and allows for incrementally reading tiffs. Basically, Encoder/Decoder both write from/to this struct while encoding/decoding an image.
/// Just use u64, since we're not targetting super memory-constrained things
struct TIFF {
// ifd_offsets: Option<Vec<u64>> // was seen_ifds in Decoder, but I would like a scan_ifds() function
images: Vec<enum {Image(Image), Offset(u64)}> // indices should match, or enum
bigtiff: bool,
byte_order: ByteOrder,
}
///
/// IFD that contains an image as opposed to a sub-ifd (which also may contain an image, but we don't necessarily check there
struct Image<'a> {
// useful fast-access tags
// _not_ necessarily bigtiff or endianness
sub_ifds: Vec<IFD>
}
/// IFD that doesn't necessarily hold all tags, or even image data
struct<'a> IFD {
sub_ifds: Vec<IFD>, // vec ?is? a pointer, so ?no? infinity otherwise Vec<Box/Arc<IFD>>
inner: BTreeMap<Tag, Entry>
}
Then, in the mid-level API, (Image level) has static functions, and is a thin, ergonomic wrapper that can be re-implemented if other tag-reading logic is wanted. (I'd almost say, add an EasyDecoder struct that needs less initialization.
impl Decoder {
/// static method for reading in an IFD, are these on Image? If they are, I think they should be exposed there
pub fn read_ifd<R: Read + Seek>(reader: R, offset: u64) -> IFD {todo!;}
pub fn read_ifd_async<R: AsyncRead + AsyncSeek + Unpin> (reader: R, offset: u64) -> IFD {todo!().await}
/// scans over the linked list of ifds without reading them
pub fn scan_images(&mut self) {
// same as next_ifd, but doesn't read and goes on completely
self.ifd_offsets = todo!();
}
/// read in all ifds of all images
pub fn read_image_ifds<R: Read + Seek>(&self) {
if self.ifd_offsets.is_none() {self.scan_ifds();} //should be unnecessary
let Some(offsets) = self.ifd_offsets else { unreachable!(); }
let self.images = offsets.map(|offset| self.read_image_ifd(offset)).collect();
}
}
- Document high-level vs mid/low-level api, which - I think - would
- Publish
Imagefor debugging(#222)/wrapping purposes(#246)- For this, it should be clear what
Imageis. Now, there are multiple use-cases for image:- (partially) decoding a tiff <- the Directory<Tag, Entry> holds offsets and/or
Values. - encoding a tiff or further processing <- the Directory<Tag, DecodedEntry> holds only
Values- How to handle sub-IFDs in this case? Should we have something like:
Then DecodedEntry holds apub struct DecodedIfd<'a> { ifd: BTreeMap<Tag, DecodedEntry>; sub_ifds: vec<&'a DecodedIfd> } /// An Image that has retrieved all IFD info when decoding, or for encoding pub struct DecodedImage<'a> { sub_ifds: Vec<&'a DecodedIfd> // or a more "basic" sub-ifd type }Value::IFD(sub_ifds.find(the_ifd))orValue::IFD8(vec_index). Allows for recursion? Or Box something? I would say: "the [Image] struct holds all necessary metadata for decoding an image from an (async) decoder" <- tyingImageandDecodercloser together, basically I'm trying to avoid a big maintenance
- How to handle sub-IFDs in this case? Should we have something like:
- (partially) decoding a tiff <- the Directory<Tag, Entry> holds offsets and/or
- For this, it should be clear what
- More control over the decoding process: Allow reading an image without directly reading all tags (#85) or read them all (#243 #242 yay tiff magic).
- This would also allow reading just an IFD, without needing to read in all things
Differences between Encoder/Decoder
Main difference is that the Encoder uses generics for SampleType (RGB8/RFloat64-type images)/TiffKind "bigtiffnes", whereas decoder uses properties/fields. As suggested elsewhere, fields is the preferred way forward, but this begs several questions.
- How badly should we break the API?
- new_image::<RBG8>() is very ergonomic, much better than manually setting fields. However, it leads to big manual lists of each and every type with its sampledepth etc due to combinatorial explosion.
- I would say, keep the API, and create an Image/TIFF from it that holds the information as properties.
e.g.
impl ImageEncoder { // no generics
fn new_image<T: SampleType>() {
let image = Image::new()
/// set the Image's values based on T,
image.set_tag(Tag::SamplesPerPixel, T::SAMPLES_PER_PIXEL)
}
}
It's a bit rough atm, but here's the idea!