image-tiff
Expose Image and friends to allow for Multithreading implementations
Tiff is used for reading GeoTiff and, by extension, COGs (Cloud Optimized GeoTIFFs). Those all require reading tiffs, and my friends and I think image/tiff is the place to nag for cool features that make our lives easier. Otherwise, we'd all be implementing our own decoders on top of preloaded images and such. This matters especially for reading COGs, where concurrent, partial reads of a tiff are needed. This is mainly a discussion starter, but I think tiff::decoder::Image should be polished a bit more and then made public.
Summary
Make Image and other types public to allow for easily extending the Decoder. Also provide examples and implementation of an extended reader.
Motivation
The GeoTiff landscape is currently quite fragmented, and most implementations are stuck on being able to decode unusual tiff/compression types. Those issues actually belong in this crate. I think this crate should either:
- provide async/multithreaded support
- have an extensible API to be able to implement said support.
Then all geo-related functionality (and only that) can be put into georust/geotiff#7. Also, libtiff has a multilayered API and, based on this comment, I'd assume this library is trying to be somewhat analogous to libtiff. Therefore, exposing the API at multiple levels of abstraction should fit within this crate?
Currently remaining ugliness
The Image struct may need some polishing before being published as a pub struct. Below are some ramblinations on what could be improved before exposing the Image struct.
- Currently, creating a Decoder from an Image is not hassle-free: byte_order is in the reader, bigtiff is in the decoder (though that field is no longer needed once the metadata is loaded), and so the implementation of ChunkReader uses all of those. This info (byte_order and bigtiff) could be added to the Image struct.
- Reading in the chunk offset/length fields in a large COG still takes quite some time. This could be circumvented by having an enum like:
  ```rust
  /// Enum for partially-loaded tags for chunk_offset and chunk_bytes
  pub enum ChunkInfos {
      /// This tag has not been loaded yet, please read in the chunk you need
      Uninitialized(Entry), // Entry field for ergonomic initialization
      /// This tag has a minority of values read that are not necessarily close to each other
      Sparse(Entry, HashMap<u32, u64>),
      /// This tag has chunks read that form a sub-rectangle in the larger tiff
      /// assumes a rectangle from topleft-botright, where x and y difference (or rather I and J according to [GeoTiff Spec](https://docs.ogc.org/is/19-008r4/19-008r4.html#_device_space_and_geotiff)) is calculated from TileAttributes
      Rect { entry: Entry, topleft: u32, botright: u32, data: Vec<u64> },
      /// This tag is either entirely loaded, or has loaded enough data to be dense. `0` indicates a missing value.
      Dense(Vec<u64>),
  }

  impl ChunkInfos {
      fn get(chunk_index: u32) -> TiffResult<u64> {
          // logics
      }

      fn retrieve<R: Read + Seek>(chunk_index: u32, reader: R, byte_order: ByteOrder, bigtiff: bool) { //limits: Limits,
          // more logics?
      }

      /// Not sure
      fn retrieve_async<R: AsyncRead + AsyncSeek + Unpin>(
          // bla
      ) {
          // more logics.await?
      }
  }
  ```
  That would allow for partial reads of these tags. Reading and representation of these tags is directly embedded in this crate, so an extender could not implement this by themselves.
- Possibly, tiff::Image and image-rs::Image overlap in naming, calling for tiff::TiffImage.
- I could probably think of more possible objections, but would need actual feedback first.
- The name ChunkReader was co-developed with buzz from https://nurdspace.nl
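
To make the ChunkInfos idea from the list above a bit more concrete, here is a minimal, self-contained sketch of how the partially-loaded states could behave. It is only an illustration under a few assumptions: `ChunkCache`, `insert`, and the "promote to dense once more than half of the values are known" rule are hypothetical names and choices, not anything that exists in the tiff crate. The `Entry` field and the `Rect` variant are left out, no file reading happens, and `get` returns an `Option` instead of a `TiffResult` so the snippet compiles on its own.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the proposed `ChunkInfos`.
/// The `Entry` field and the `Rect` variant are omitted on purpose.
#[derive(Debug)]
pub enum ChunkCache {
    /// Nothing has been loaded yet.
    Uninitialized,
    /// A minority of values is known, keyed by chunk index.
    Sparse(HashMap<u32, u64>),
    /// One slot per chunk; `0` marks a value that is still missing.
    Dense(Vec<u64>),
}

impl ChunkCache {
    /// Returns the cached value for `chunk_index`, or `None` if it has not
    /// been loaded yet (the caller would then read it from the file).
    pub fn get(&self, chunk_index: u32) -> Option<u64> {
        match self {
            ChunkCache::Uninitialized => None,
            ChunkCache::Sparse(map) => map.get(&chunk_index).copied(),
            ChunkCache::Dense(values) => values
                .get(chunk_index as usize)
                .copied()
                .filter(|&v| v != 0),
        }
    }

    /// Records a freshly read value. `chunk_count` is the total number of
    /// chunks in the image; once more than half of them are known, the
    /// sparse map is converted into a dense vector (an assumed heuristic).
    pub fn insert(&mut self, chunk_index: u32, value: u64, chunk_count: u32) {
        match self {
            ChunkCache::Uninitialized => {
                *self = ChunkCache::Sparse(HashMap::from([(chunk_index, value)]));
            }
            ChunkCache::Sparse(map) => {
                map.insert(chunk_index, value);
                if map.len() * 2 > chunk_count as usize {
                    let mut dense = vec![0u64; chunk_count as usize];
                    for (&i, &v) in map.iter() {
                        // Indices beyond `chunk_count` are silently dropped.
                        if let Some(slot) = dense.get_mut(i as usize) {
                            *slot = v;
                        }
                    }
                    *self = ChunkCache::Dense(dense);
                }
            }
            ChunkCache::Dense(values) => {
                if let Some(slot) = values.get_mut(chunk_index as usize) {
                    *slot = value;
                }
            }
        }
    }
}
```

A reader built on top of this would call `get` first and only go to the underlying file (or HTTP range request, for a COG) when it returns `None`, then `insert` the result. Whether the promotion threshold, or a separate `Rect` state, is worth the extra complexity is exactly the kind of thing that would need feedback.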