
Multiple roi in Sampler

Open Modexus opened this issue 3 years ago • 2 comments

I have an issue with passing multiple rois to a Sampler.

I am trying to use the same train/val/test split as Tile2Vec. In the paper they split a single roi into a 12x12 grid. Every grid cell is then randomly assigned to one of the three datasets.

The issue is that the sampler only takes a single roi as an argument (as do the datasets). I can create the 12x12 grid, but I am unable to assign multiple non-contiguous regions to the train/val/test sets.

Is it possible to somehow pass multiple regions as the roi argument (I am not familiar with rtree)? My current workaround is changing GeoSampler to handle multiple regions.
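
The grid split described above can be sketched in plain Python, independent of torchgeo. This is only an illustration under stated assumptions: `split_roi_grid` and `assign_cells` are hypothetical helpers, a box is a plain `(minx, maxx, miny, maxy)` tuple rather than a real bounding box type, and the 80/10/10 fractions are arbitrary placeholders, not the values used in the paper.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Assumption: a box is a plain (minx, maxx, miny, maxy) tuple,
# standing in for whatever bounding box type the library uses.
Box = Tuple[float, float, float, float]


def split_roi_grid(roi: Box, size: int = 12) -> List[Box]:
    """Split one roi into size x size equally sized cells (hypothetical helper)."""
    minx, maxx, miny, maxy = roi
    dx = (maxx - minx) / size
    dy = (maxy - miny) / size
    cells = []
    for i in range(size):
        for j in range(size):
            cells.append(
                (minx + i * dx, minx + (i + 1) * dx,
                 miny + j * dy, miny + (j + 1) * dy)
            )
    return cells


def assign_cells(
    cells: Sequence[Box],
    fractions: Tuple[float, float, float] = (0.8, 0.1, 0.1),  # placeholder values
    seed: int = 0,
) -> Dict[str, List[Box]]:
    """Randomly assign every cell to exactly one of train/val/test."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    shuffled = list(cells)
    rng.shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # remainder goes to test
    }


cells = split_roi_grid((0.0, 120.0, 0.0, 120.0))
splits = assign_cells(cells)
```

Each of the three lists is a set of non-contiguous regions, which is exactly what a sampler accepting only a single roi cannot represent.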

Modexus, May 18 '22 09:05

I think you're right that it isn't possible with the current setup. I'm open to suggestions: either allowing multiple rois in a GeoSampler, or some kind of dataset_split method that supports grids.

adamjstewart, May 18 '22 21:05

Splitting the dataset directly does not seem feasible at the moment, as the dataset does not allow for multiple rois (this would also be a good feature, for example for using multiple regions of NAIP distributed over the country).

As mentioned, my current workaround involves changing the GeoSampler to support multiple rois and passing a roi_split_grid function to the sampler. This seems nice because the dataset does not have to be changed; only the sampler's roi is restricted (and, more generically, any roi_split function can be passed).

This does not work yet as it interferes with #537, but once I remove these it should.

Modexus, May 18 '22 21:05

This feature was added in #866; see torchgeo.datasets.random_grid_cell_assignment.

adamjstewart, Sep 29 '23 20:09