FLAIRHUB dataset (mono-temporal)

Open gatienc opened this issue 2 weeks ago • 4 comments

The FLAIR-HUB dataset is a large-scale, multimodal dataset for land cover and crop mapping from the French National Institute of Geographical and Forest Information (IGN). It builds upon and includes the FLAIR#1 and FLAIR#2 datasets, expanding them into a unified resource with very-high-resolution annotations.

This PR adds the mono-temporal part of the FLAIR-HUB dataset.

Let me know if anything has to be changed. The mono-temporal part of the dataset is already very solid and can be used independently. I plan to work on the time-series part of the dataset once time-series support is implemented in TorchGeo.

Note

As the dataset is huge (~750 GB in total), I wasn't able to test training on the full dataset. It is possible to work on a subset by commenting out most of the entries in the domain years dict in Torchgeo/datasets/flairhub.py#L174.
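
As an illustration of the subset idea, here is a minimal sketch that filters a mapping instead of commenting entries out by hand. The dict name, its keys, and its values are hypothetical placeholders; the actual dict in flairhub.py may be structured differently.

```python
# Hypothetical stand-in for the domain years dict; the real keys and values
# in flairhub.py are not reproduced here.
domain_years = {
    "domain_a_2020": "domain_a_2020.zip",
    "domain_b_2021": "domain_b_2021.zip",
    "domain_c_2022": "domain_c_2022.zip",
}

# Domains to keep for a quick experiment (placeholder names).
keep = {"domain_a_2020", "domain_c_2022"}

# Build the reduced mapping without touching the original dict.
subset = {name: value for name, value in domain_years.items() if name in keep}

print(sorted(subset))  # ['domain_a_2020', 'domain_c_2022']
```

Filtering this way keeps the full list intact, so switching back to the whole dataset only requires changing `keep`.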

Acknowledgments

Special thanks to the work done in #2394, which served as a starting point for this implementation.

Todo

  • [ ] Implement the time-series part of the dataset
  • [ ] Add the historical part of the dataset

gatienc avatar Dec 08 '25 20:12 gatienc

Sorry for the repost! I accidentally used the main branch on my fork for this PR. I needed to clean up my fork to work on other contributions.

gatienc avatar Dec 08 '25 20:12 gatienc

@vbuchauer is interested in this dataset for time-series benchmarking. How hard would it be to add the multi-temporal version of the dataset? We're in the process of converting all time-series datasets to B x T x C x H x W, and FLAIR-HUB seems like a good candidate for this.

adamjstewart avatar Dec 09 '25 12:12 adamjstewart

I definitely want to implement the time-series part of this. I heard in the discussion that the time-series implementation is ongoing. Is there already a sample dataset I could base the behavior on, so that FLAIR-HUB matches the same pattern?

gatienc avatar Dec 09 '25 22:12 gatienc

Yes, PASTIS is a good example. As long as the dataset returns T x C x H x W data, then it should be compatible.
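
To make the shape convention concrete, here is a minimal sketch of a dataset whose samples are T x C x H x W, batched by a standard PyTorch `DataLoader` into B x T x C x H x W. `ToyTimeSeriesDataset`, its sizes, and the `image`/`mask` sample keys are assumptions for illustration only; this is not the PASTIS or FLAIR-HUB implementation.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyTimeSeriesDataset(Dataset):
    """Hypothetical dataset returning random T x C x H x W samples."""

    def __init__(self, length=8, t=4, c=3, h=16, w=16):
        self.length = length
        self.shape = (t, c, h, w)

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        t, c, h, w = self.shape
        return {
            # Time series of images: T x C x H x W
            "image": torch.rand(t, c, h, w),
            # Single label map for the whole series: H x W
            "mask": torch.zeros(h, w, dtype=torch.long),
        }


dataset = ToyTimeSeriesDataset()
loader = DataLoader(dataset, batch_size=2)

# The default collate function stacks samples along a new leading batch axis.
batch = next(iter(loader))
print(batch["image"].shape)  # torch.Size([2, 4, 3, 16, 16])
print(batch["mask"].shape)   # torch.Size([2, 16, 16])
```

This relies on every sample having the same T. Series of varying length would need padding or a custom collate function.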

adamjstewart avatar Dec 10 '25 08:12 adamjstewart