torchgeo
Maxar imagery dataset
https://www.maxar.com/products/satellite-imagery
It seems like the imagery isn't free/open-source, but they do have samples we could use to write a data loader: https://resources.maxar.com/product-samples
For anyone who works on this:
A good starting point is the System-Ready Stereo (1B), 8-band bundle, 50 cm | Rio de Janeiro, Brazil. The zip file download contains a "normal" Maxar scene with the following directory structure:
./056078906040/ - I think this should be the root directory passed to the dataset class
./056078906040/056078906040_01_P001_PAN/ - Contains the panchromatic band, broken up spatially into one or more TIF files (plus accompanying files) called "looks"
./056078906040/056078906040_01_P001_MUL/ - Similar to the panchromatic directory; contains the multispectral bands, broken up spatially into one or more TIF files (plus accompanying files) called "looks"
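To make the layout above concrete, here is a minimal sketch of how a dataset class could locate the band directories and their looks. The function names `find_band_dirs` and `find_looks` are hypothetical, and the sketch assumes only what is described above: subdirectories whose names end in `_PAN` or `_MUL`, each holding one or more TIF files.

```python
from pathlib import Path


def find_band_dirs(root):
    """Group the subdirectories of a scene root by their _PAN / _MUL suffix."""
    root = Path(root)
    found = {"PAN": [], "MUL": []}
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue
        for suffix in found:
            if child.name.endswith("_" + suffix):
                found[suffix].append(child)
    return found


def find_looks(band_dir):
    """Return the TIF files ("looks") in one band directory, sorted by name."""
    return sorted(
        p
        for p in Path(band_dir).iterdir()
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )
```

Calling `find_band_dirs("./056078906040")` on the sample would be expected to return one directory under each key, and `find_looks` could then be called on each of them.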
I also think the dataset object should parse the XML files in the _PAN and _MUL subdirectories to get information about the scene (off-nadir angle, processing level, estimated cloud coverage, etc.).
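A rough sketch of that XML parsing, using only the standard library. The tag names in `DEFAULT_TAGS` are assumptions for illustration and should be checked against the XML files in the sample download; `read_scene_metadata` is a hypothetical helper, not an existing torchgeo function.

```python
import xml.etree.ElementTree as ET

# Assumed tag names; verify them against the actual XML files before relying on them.
DEFAULT_TAGS = ("MEANOFFNADIRVIEWANGLE", "PRODUCTLEVEL", "CLOUDCOVER")


def read_scene_metadata(xml_source, tags=DEFAULT_TAGS):
    """Return {tag: text} for the first element matching each tag, or None if absent.

    xml_source may be a file path or an open file object.
    """
    root = ET.parse(xml_source).getroot()
    metadata = {}
    for tag in tags:
        element = root.find(f".//{tag}")  # search all descendants of the root
        if element is not None and element.text:
            metadata[tag] = element.text.strip()
        else:
            metadata[tag] = None
    return metadata
```

Values are returned as strings so the dataset can decide how to convert them. A missing tag yields `None` rather than an error, which keeps the loader tolerant of differences between products.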
Finally, it doesn't look like the TIFs are internally tiled by default, which will make windowed reading extremely slow. Users should be warned to convert the TIFs to COGs before making a dataset with them; for example, we could emit a warning when the Dataset is created with non-tiled TIFs.
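As an illustration of that check, here is a dependency-free sketch that looks for the TileWidth tag (322) in the first image directory of a TIFF or BigTIFF file and warns when it is missing. The function names are hypothetical. torchgeo would more likely ask its raster library for the block layout instead of parsing headers by hand. Tiling is only one requirement of a COG, so this detects non-tiled files but does not prove a file is a valid COG.

```python
import struct
import warnings

TILE_WIDTH_TAG = 322  # TIFF TileWidth tag; only present in tiled files


def is_tiled_tiff(path):
    """Return True if the first IFD of a TIFF/BigTIFF file has a TileWidth tag."""
    with open(path, "rb") as f:
        header = f.read(16)
        if len(header) < 8 or header[:2] not in (b"II", b"MM"):
            raise ValueError(f"{path} is not a TIFF file")
        bo = "<" if header[:2] == b"II" else ">"  # little- or big-endian
        (magic,) = struct.unpack(bo + "H", header[2:4])
        if magic == 42:  # classic TIFF: 4-byte offsets, 12-byte entries
            (ifd_offset,) = struct.unpack(bo + "I", header[4:8])
            count_fmt, entry_size = "H", 12
        elif magic == 43 and len(header) == 16:  # BigTIFF: 8-byte offsets, 20-byte entries
            (ifd_offset,) = struct.unpack(bo + "Q", header[8:16])
            count_fmt, entry_size = "Q", 20
        else:
            raise ValueError(f"{path} is not a TIFF file")
        f.seek(ifd_offset)
        count_bytes = f.read(struct.calcsize(bo + count_fmt))
        (n_entries,) = struct.unpack(bo + count_fmt, count_bytes)
        for _ in range(n_entries):
            entry = f.read(entry_size)
            (tag,) = struct.unpack(bo + "H", entry[:2])
            if tag == TILE_WIDTH_TAG:
                return True
    return False


def warn_if_untiled(path):
    """Emit a UserWarning when a TIFF is stored in strips instead of tiles."""
    if not is_tiled_tiff(path):
        warnings.warn(
            f"{path} is not internally tiled; windowed reads may be slow. "
            "Consider converting it to a COG first.",
            UserWarning,
        )
```

The dataset constructor could call `warn_if_untiled` once per file it indexes, so users see the warning at creation time and not partway through training.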
We may want to add a warning message for any raster file that isn't a COG; that should be easy to do. Is there a similar cloud-optimized file format for vector files, or are shapefiles the best we can do?