torchgeo
Convert many NonGeoDatasets to GeoDatasets
During the GeoDataset refactor (#37), all existing GeoDatasets were moved to VisionDataset. Now that the GeoDataset API has settled down, we should attempt to convert many of these VisionDatasets that have geospatial information back to GeoDataset.
We may need a STACDataset base class that subclasses GeoDataset and describes how to pull geospatial information from STAC JSON files.
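To make the STAC idea a bit more concrete, here is a rough sketch of pulling spatial and temporal bounds out of a single STAC Item. It only relies on the `bbox` and `properties.datetime` / `start_datetime` / `end_datetime` fields defined by the STAC Item spec. The `Bounds` tuple, the `bounds_from_stac_item` helper, and the (minx, maxx, miny, maxy, mint, maxt) ordering are assumptions made for illustration, not part of any existing torchgeo API, and the example item is made up.

```python
import json
from datetime import datetime
from typing import NamedTuple


class Bounds(NamedTuple):
    """Hypothetical container for spatiotemporal bounds (not a torchgeo class)."""

    minx: float
    maxx: float
    miny: float
    maxy: float
    mint: float
    maxt: float


def _parse_time(value: str) -> float:
    # STAC uses RFC 3339 timestamps; older Pythons cannot parse a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00")).timestamp()


def bounds_from_stac_item(item: dict) -> Bounds:
    """Extract bounds from a parsed STAC Item dictionary."""
    bbox = item["bbox"]
    if len(bbox) == 6:
        # 3D bbox: [west, south, min elevation, east, north, max elevation]
        west, south, _, east, north, _ = bbox
    else:
        # 2D bbox: [west, south, east, north]
        west, south, east, north = bbox

    props = item["properties"]
    if props.get("datetime"):
        # A single acquisition instant
        mint = maxt = _parse_time(props["datetime"])
    else:
        # "datetime" is null, so the spec requires a start/end range instead
        mint = _parse_time(props["start_datetime"])
        maxt = _parse_time(props["end_datetime"])

    return Bounds(west, east, south, north, mint, maxt)


# Made-up example item, trimmed to the fields used above
item = json.loads(
    """
    {
      "type": "Feature",
      "bbox": [36.0, -1.0, 36.5, -0.5],
      "properties": {"datetime": "2019-06-06T00:00:00Z"}
    }
    """
)
bounds = bounds_from_stac_item(item)
print(bounds)
```

A real base class would also need to deal with the coordinate reference system and with locating the image assets, which this sketch leaves out.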
@recursix continuing the discussion from #353: yes, we should definitely convert many of our current VisionDatasets to GeoDataset. CV4A is definitely an obvious choice. If you have any interest in submitting a PR to convert CV4A to GeoDataset, I'm happy to help and give pointers. If not, I'm sure someone will get to it eventually.
For other datasets that are pre-chipped, I haven't yet decided whether it makes sense to convert them to GeoDataset (and require use of a GeoSampler) or keep them as VisionDataset (and allow integer indexing). On the one hand, it would be cool to combine those benchmark datasets with other geospatial data. On the other hand, it makes usage a bit more complicated due to the requirement of a GeoSampler. Before we mass convert pre-chipped datasets, I think we'll want to compare sampling performance before and after, as GeoSampler is known to be slower than we'd like.
Great, I'll see if I can unlock some time to contribute (very unsure at the moment). For datasets that are pre-chipped, I think it still makes sense to have the geolocation information available as well. The main use case I see is that someone may be interested in performing out-of-distribution evaluation by splitting train and test sets based on geolocation.
It could simply be added to VisionDataset, with a way to obtain the bounding box for a given sample.
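As a rough illustration of that use case, the sketch below assumes each pre-chipped sample can report a bounding box and splits the integer indices into train and test sets by longitude. `ChipBounds` and `split_by_longitude` are hypothetical names invented for this example, and the coordinates are made up; nothing here is an existing torchgeo API.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ChipBounds:
    """Hypothetical per-sample bounding box for a pre-chipped dataset."""

    minx: float
    maxx: float
    miny: float
    maxy: float


def split_by_longitude(
    bounds: Sequence[ChipBounds], threshold: float
) -> Tuple[List[int], List[int]]:
    """Put chips centered west of `threshold` in train and the rest in test."""
    train: List[int] = []
    test: List[int] = []
    for index, box in enumerate(bounds):
        center_x = (box.minx + box.maxx) / 2
        (train if center_x < threshold else test).append(index)
    return train, test


# Made-up boxes standing in for what a dataset could return per sample
chips = [
    ChipBounds(10.0, 10.1, 50.0, 50.1),
    ChipBounds(10.2, 10.3, 50.0, 50.1),
    ChipBounds(12.0, 12.1, 50.0, 50.1),
    ChipBounds(12.4, 12.5, 50.0, 50.1),
]
train_idx, test_idx = split_by_longitude(chips, threshold=11.0)
print(train_idx, test_idx)  # [0, 1] [2, 3]
```

Because the split only produces integer indices, the dataset itself could keep integer indexing and still support a geographically separated evaluation.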