Support for CInt16 and CInt32 Sample Formats
Hi there!
I'd like to suggest adding support for the CInt16 and CInt32 sample formats to this library. These formats are used for complex integer samples, used in Synthetic Aperture Radar images such as Sentinel-1 Single Look Complex products, and are supported by the GDAL library. Adding support for these formats will enhance the library's ability to handle a wider range of GeoTIFF files, including those used in SAR applications.
About contributing: I am willing to contribute to the implementation of this feature, but I'm not sure where to start. Should this be added to the upstream tiff library, or as an extension here? I'd love to hear suggestions/requirements for this contribution 😄
Hi @vini-fda! Yes, I'd love to support complex types used in SAR imagery, mentioned briefly at https://github.com/georust/geotiff/pull/17#discussion_r1747997111. New dtypes should be added to the RasterData enum here:
https://github.com/georust/geotiff/blob/95172cb80e0eeb0bd6bb5409927aadef34464e9a/src/raster_data.rs#L4-L15
Should this be added to the upstream tiff library, or as an extension here?
Ideally yes, we should add this upstream, by making changes to the DecodingResult enum I think (this recent PR which added fp16 support at https://github.com/image-rs/image-tiff/pull/257 might be helpful as a reference). That said, the tiff crate has been slow on development/releases and there are known issues working with it (e.g. https://github.com/image-rs/image-tiff/issues/264), so I'm debating on whether #7 (refactoring on top of the tiff crate) is still a good idea, or we need to rely on a more active/maintained 'tiff' crate/library.
Oh okay, that seems doable! One question: why is RasterData an enum and not a struct with a type parameter T? IMO that would make it easier to just expand the implementation, and the type T could be restricted by a trait bound to only contemplate the types we want.
If you can find a way to simplify the RasterData implementation, please go ahead! We weren't thinking too hard on what was the 'right' way to do things in https://github.com/georust/geotiff/pull/17, it was more just a starting point 😉
Btw, if you could post a sample SLC file with complex dtypes (unsure if you can get it down to just one burst to make it small), and also the error you're getting when reading it with this crate, that would be helpful.
One question: why is
RasterDataan enum and not a struct with a type parameterT?
In an enum, your type doesn't have to be known at compile time. If you had an arbitrary read_geotiff command, which takes in a file path and returns data, you couldn't make it have a type parameter, because the user would have to know in advance, at compile time the type of the data to call the function. But we won't know the actual type of the data until we read the file metadata.
So really there are 3 options:
- An enum with the variants containing the data, as exists now.
- A "data type" enum, whose variants don't hold data, and then you make the user call e.g.
reader.read_data::<u8>(), and you check that the user's passed-in type matches the runtime type of the data. - You use trait objects and downcasting, so you return an opaque
Arc<dyn RasterData>, and then the user has to call a method likedata_type()to know how to downcast to a concrete type.
If you can find a way to simplify the
RasterDataimplementation, please go ahead! We weren't thinking too hard on what was the 'right' way to do things in #17, it was more just a starting point 😉Btw, if you could post a sample SLC file with complex dtypes (unsure if you can get it down to just one burst to make it small), and also the error you're getting when reading it with this crate, that would be helpful.
Sure! I'll look into how to improve the implementation of RasterData. I also will add tests with GeoTIFFs that have complex dtypes for sure, thanks for the suggestion!
One question: why is
RasterDataan enum and not a struct with a type parameterT?In an enum, your type doesn't have to be known at compile time. If you had an arbitrary
read_geotiffcommand, which takes in a file path and returns data, you couldn't make it have a type parameter, because the user would have to know in advance, at compile time the type of the data to call the function. But we won't know the actual type of the data until we read the file metadata.
Yeah, I forgot about that 😅 , that's a totally valid point.
So really there are 3 options:
1. An enum with the variants containing the data, as exists now. 2. A "data type" enum, whose variants don't hold data, and then you make the user call e.g. `reader.read_data::<u8>()`, and you check that the user's passed-in type matches the runtime type of the data. 3. You use trait objects and downcasting, so you return an opaque `Arc<dyn RasterData>`, and then the user has to call a method like `data_type()` to know how to downcast to a concrete type.
Honestly, I think keeping the enum implementation seems fine for now. My worry was that it sounds too cumbersome to add new functionality (or implementations of new methods), but that's the price to pay for dynamic data types without trait objects. Also, I found this discussion on the Rust Programming Language Forum relevant, especially regarding the trade-off between trait objects vs ADTs.
Hi! So I'm making a simple implementation of CInt16 & CInt32 over at this fork. I also added some features to the tiff crate, here. For now, I added a simple test SLC Image, and the integration test passes. I'd be glad to get any feedback 😃
Note: for reference, for the test I'm using an SLC Image over Mexico City, which can be obtained on Alaska Satellite Facility's data search. Alternatively, using the burst downloader:
wget -c --http-user=$ASF_USERNAME --http-password=$ASF_PASSWORD "https://sentinel1-burst.asf.alaska.edu/S1A_IW_SLC__1SSV_20151022T122539_20151022T122606_008265_00BA51_5A48/IW3/VV/2.zip"
I then just reduced the file size by running gdal_translate -srcwin 0 0 200 200 file.tif. I also didn't find a way to compare some additional metadata such as GCPs, tags like "TIFFTAG_IMAGEDESCRIPTION", etc with gdalinfo, so I didn't add it to the tests yet.
Hi! So I'm making a simple implementation of CInt16 & CInt32 over at this fork. I also added some features to the tiff crate, here. For now, I added a simple test SLC Image, and the integration test passes. I'd be glad to get any feedback 😃
Nice! For the changes you made to image-tiff, maybe open a PR upstream and see if they would be willing to accept it. On the geotiff side of things, I noticed you added a few extra things (e.g. a RasterValue enum, and a get_value_at_pixel method), which are both good to have but could probably be split off and merged separately before the Complex32/Complex64 related changes are added (will be easier for me and others to review small PRs than one big one).
P.S. I noticed you added a todo!() for f16 decoding, I can take a stab at that based on https://github.com/image-rs/image-tiff/pull/257 if there's interest in supporting that dtype. Edit: appears to be slightly trickier than I thought, maybe wait for f16 to stabilize in Rust first, xref https://github.com/rust-lang/rust/issues/116909
Hi @weiji14, I just created the PR with the other features. Sorry for the mess, I was just trying to get something to work locally with my fork, and I polluted my commit history with these extra features.
Edit: Also, before I make a PR upstream, do you think maybe using num_complex's Complex<T> would be better than creating new CInt types?
Thanks for the PR at #27, and sorry for the late reply (was busy with other work). I think going with num_complex is ok as you've already tried at https://github.com/vini-fda/geotiff/pull/1.
The trickier question is how to handle add DecodingResult::CInt16 and DecodingResult::CInt32 enum variants upstream to image-tiff as you've done at https://github.com/vini-fda/image-tiff/commit/f2a263114e15770c78cb5d7a90e459075da9bf68. Do you want to try to see if they're willing to accept a PR? Otherwise we may need to rebase the entire geotiff repo on another TIFF crate.