FLAIRHUB dataset (mono-temporal)
The FLAIR-HUB dataset is a large-scale, multimodal dataset for land cover and crop mapping from the French National Institute of Geographical and Forest Information (IGN). It builds upon and includes the FLAIR#1 and FLAIR#2 datasets, expanding them into a unified resource with very-high-resolution annotations.
This PR adds the FLAIRHUB mono-temporal part of the dataset.
Let me know if anything needs to be changed. The mono-temporal part of the dataset is already solid and can be used independently. I plan to work on the time-series part once time-series support is implemented in TorchGeo.
Note
Because the dataset is very large (~750 GB in total), I wasn't able to test training on the full dataset. It is still possible to work on a subset by commenting out most of the domain years dict (Torchgeo/datasets/flairhub.py#L174).
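As an alternative to commenting entries out by hand, the same kind of subset could be taken programmatically. This is only a minimal sketch: `DOMAIN_YEARS`, its keys and values, and the helper `first_n_domains` are hypothetical placeholders and do not reflect the actual contents or structure of the dict in flairhub.py.

```python
from itertools import islice

# Hypothetical stand-in for the domain/year mapping; the real entries differ.
DOMAIN_YEARS = {
    "domain_a": 2019,
    "domain_b": 2020,
    "domain_c": 2021,
    "domain_d": 2022,
}


def first_n_domains(mapping: dict[str, int], n: int) -> dict[str, int]:
    """Return a new dict holding only the first n entries, in insertion order."""
    return dict(islice(mapping.items(), n))


# Keep two domains for a quick local experiment; the original dict is untouched.
subset = first_n_domains(DOMAIN_YEARS, 2)
print(subset)
```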
Acknowledgments
Special thanks to the work done in #2394, which served as a starting point for this implementation.
Todo
- [ ] Implement the time-series part of the dataset
- [ ] Add the historical part of the dataset
Sorry for the repost! I accidentally used the main branch on my fork for this PR. I needed to clean up my fork to work on other contributions.
@vbuchauer is interested in this dataset for time-series benchmarking. How hard would it be to add the multi-temporal version of the dataset? We're in the process of converting all time-series datasets to B x T x C x H x W, and FLAIR-HUB seems like a good candidate for this.
I definitely want to implement the time-series part of this. I heard in the discussion that the time-series implementation is ongoing. Is there already a sample dataset I could base the behavior on, so I can match the pattern in FlairHub?
Yes, PASTIS is a good example. As long as the dataset returns T x C x H x W data, then it should be compatible.
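As a rough sketch of that contract (not the actual PASTIS or FLAIR-HUB code), a dataset whose samples are T x C x H x W gets stacked by PyTorch's default `DataLoader` collation into B x T x C x H x W. The class `ToyTimeSeries`, the `"image"` and `"mask"` keys, the single H x W mask per sample, and all sizes below are assumptions made for illustration.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyTimeSeries(Dataset):
    """Hypothetical stand-in dataset; not FLAIR-HUB or PASTIS."""

    def __init__(self, length: int = 8, t: int = 5, c: int = 4, size: int = 16) -> None:
        self.length = length
        self.shape = (t, c, size, size)

    def __len__(self) -> int:
        return self.length

    def __getitem__(self, index: int) -> dict[str, torch.Tensor]:
        t, c, h, w = self.shape
        return {
            # One sample: T time steps, C channels, H x W pixels.
            "image": torch.rand(t, c, h, w),
            # Assumed here: a single H x W label mask per sample.
            "mask": torch.zeros(h, w, dtype=torch.long),
        }


# Default collation stacks samples along a new leading batch dimension.
loader = DataLoader(ToyTimeSeries(), batch_size=2)
batch = next(iter(loader))
print(batch["image"].shape)  # B x T x C x H x W
print(batch["mask"].shape)   # B x H x W
```

Default collation stacks tensors, so it only works when every sample in a batch has the same T; series of varying length would need padding or a custom `collate_fn`.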