Add augmentation to USAVars Dataset from paper code base

Open nilsleh opened this issue 1 year ago • 12 comments

This PR adds the resize augmentation from the paper's code base, found here: https://github.com/Global-Policy-Lab/mosaiks-paper/blob/master/code/analysis/1_feature_extraction/2_featurize_models_deep_pretrained.py

nilsleh avatar Jun 20 '23 15:06 nilsleh

Poking around the code, I also see:

Not sure which of these are actually run and which just exist in the code.

@calebrob6 why did we call this dataset USAVars instead of MOSAIKS?

adamjstewart avatar Jun 20 '23 16:06 adamjstewart

MOSAIKS (Multi-task Observation using Satellite Imagery & Kitchen Sinks) is the name of a method that can be applied generally. USAVars is a better name for a dataset.

calebrob6 avatar Jun 20 '23 23:06 calebrob6

> Poking around the code, I also see:

I want to use this dataset for a project and am trying to reproduce their reported results with a Lightning setup instead of their big custom code base. I will report which augmentations are needed to reproduce their scores.

nilsleh avatar Jun 21 '23 08:06 nilsleh

Computed Image statistics on torchgeo train dataset split:

min: array([0., 0., 0., 0.], dtype=float32)
max: array([1., 1., 1., 1.], dtype=float32)
mean: array([0.4101762, 0.4342503, 0.3484594, 0.5473533], dtype=float32)
std: array([0.17361328, 0.14048962, 0.12148701, 0.16887303], dtype=float32)

Quite different from the ImageNet stats they use: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
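For reference, a minimal sketch of how per-channel statistics like these can be computed and applied. It assumes images arrive as an `(N, C, H, W)` array already scaled to [0, 1] (as the min/max above suggest); the helper names `channel_stats` and `normalize` are hypothetical, and the random batch is only a stand-in for real imagery.

```python
import numpy as np

# Per-channel statistics reported above for the torchgeo train split.
# Four channels; the channel order is not stated in the thread.
MEAN = np.array([0.4101762, 0.4342503, 0.3484594, 0.5473533], dtype=np.float32)
STD = np.array([0.17361328, 0.14048962, 0.12148701, 0.16887303], dtype=np.float32)


def channel_stats(images):
    """Return per-channel (min, max, mean, std) for an (N, C, H, W) array."""
    axes = (0, 2, 3)  # reduce over batch, height, and width
    return images.min(axes), images.max(axes), images.mean(axes), images.std(axes)


def normalize(images, mean=MEAN, std=STD):
    """Standardize each channel of an (N, C, H, W) array."""
    return (images - mean[None, :, None, None]) / std[None, :, None, None]


# Synthetic stand-in for a batch of 4-channel images in [0, 1].
rng = np.random.default_rng(0)
batch = rng.random((8, 4, 16, 16), dtype=np.float32)

# Statistics computed on the batch itself give zero mean / unit std per channel.
_, _, batch_mean, batch_std = channel_stats(batch)
standardized = normalize(batch, batch_mean, batch_std)
```

On a real dataset the statistics would be accumulated over the whole train split rather than a single batch, and then reused unchanged for val/test.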

nilsleh avatar Jun 21 '23 09:06 nilsleh

I think those normalizations are unique to the MOSAIKS model they use. But these are the augmentations for the CNN-based approach.

nilsleh avatar Jun 21 '23 11:06 nilsleh

In that case should we add RandomHorizontalFlip and ImageNet normalization?
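A rough sketch of what those two steps amount to, written in plain NumPy rather than any particular augmentation library. The function names are hypothetical. One caveat worth noting: the ImageNet statistics quoted earlier have three entries, while the dataset statistics in this thread have four, so the ImageNet values would only apply as-is to a three-channel RGB subset.

```python
import numpy as np

# ImageNet statistics as quoted earlier in the thread (RGB only).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def random_horizontal_flip(image, rng, p=0.5):
    """Flip a (C, H, W) image left-right with probability p."""
    if rng.random() < p:
        return image[:, :, ::-1].copy()  # reverse the width axis
    return image


def imagenet_normalize_rgb(image):
    """Normalize a (3, H, W) RGB image in [0, 1] with ImageNet statistics."""
    return (image - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]
```

The flip is applied only at training time; normalization is applied to every split.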

adamjstewart avatar Jun 21 '23 15:06 adamjstewart

Yeah, I want to try to reproduce the results first and will then update the PR here.

nilsleh avatar Jun 22 '23 09:06 nilsleh

@calebrob6 do the train/val/test splits that come with the torchgeo version of the dataset correspond to any of the checkerboard-style splits shown in Figure 3 of the MOSAIKS paper, or are these random splits?

Additionally, target variable normalization is also relevant for regression tasks. This is done here in their code. Should we add this target variable normalization as well, or at least document the mean/std values somewhere so people don't have to compute these values themselves?
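To make the idea concrete, here is a minimal sketch of target standardization, assuming the statistics are fit on the train split only and then reused for val/test. The `TargetScaler` class is hypothetical (it is not part of torchgeo or the paper's code).

```python
import numpy as np


class TargetScaler:
    """Standardize regression targets with statistics fit on the train split."""

    def fit(self, y_train):
        y_train = np.asarray(y_train, dtype=np.float64)
        self.mean_ = y_train.mean(axis=0)  # one mean per target variable
        self.std_ = y_train.std(axis=0)    # one std per target variable
        return self

    def transform(self, y):
        return (np.asarray(y, dtype=np.float64) - self.mean_) / self.std_

    def inverse_transform(self, z):
        # Map model outputs back to the original units for reporting.
        return np.asarray(z, dtype=np.float64) * self.std_ + self.mean_
```

Documenting `mean_` and `std_` per target variable would let users skip the fitting pass entirely.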

nilsleh avatar Jun 23 '23 07:06 nilsleh

I'm pretty sure they are random splits.

Also, it looks like the download isn't working (the storage account permissions were automatically switched from anonymous access to private), so I need to move this to huggingface.

Also, this isn't an exact replication of their dataset, as they used Google Earth imagery (I think) while this is NAIP imagery.

calebrob6 avatar Jun 23 '23 15:06 calebrob6

With a ResNet-18 baseline I get an R-squared score of 0.95 for treecover (paper: 0.91) when doing proper normalization. Since we cannot replicate their results directly anyway, as Caleb pointed out, I would suggest just using the normalization statistics computed on this dataset. I think adding support for target value normalization would be good as well.
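For clarity, the R-squared score here is taken to be the usual coefficient of determination, $R^2 = 1 - SS_{res} / SS_{tot}$. A minimal sketch (the function name is hypothetical, and this is not necessarily the exact evaluation code used for the numbers above):

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A perfect prediction scores 1, and always predicting the mean of `y_true` scores 0.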

nilsleh avatar Jun 29 '23 12:06 nilsleh

Hi, just saw this. Chiming in on a few things and please let me know if I can be helpful with anything else @nilsleh!

  1. Yes, we do target variable normalization, as is standard for regression. Note also that some of the target variables are transformed as y_transformed = log(1+y) (and performance is then reported with respect to the logged variables).
  2. As Caleb pointed out, the USAVars data here is based on NAIP imagery whereas the analysis in our paper is based on Google imagery, so unfortunately, don't expect the results to match up exactly with the numbers in the paper.
  3. In light of ^, if choosing to resize the imagery during preprocessing (or not), there is likely going to be a different optimal patch size for the NAIP imagery than for the imagery we use in the paper.
  4. It's possible that a different preprocessing of the images would be helpful for the CNN baseline or for MOSAIKS -- especially in light of these results: https://arxiv.org/abs/2305.13456. At the time of doing the experiments, we did what made the most sense for a solid and reasonable baseline: ZCA whitening for RCF (implemented here, following the explanation in footnote 14 here) and standard augmentation strategies for the ResNet-18 model, as you've noted above.
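Two of the preprocessing steps mentioned in the list above can be sketched generically: the log(1+y) target transform from point 1 and ZCA whitening from point 4. This is a textbook illustration, not the paper's implementation; the function names and the `eps` regularizer are assumptions, and the whitening operates on flattened feature vectors of shape `(n_samples, n_features)`.

```python
import numpy as np


def log_transform(y):
    """y_transformed = log(1 + y), for non-negative targets."""
    return np.log1p(np.asarray(y, dtype=np.float64))


def inverse_log_transform(y_transformed):
    """Undo log(1 + y) to recover values in the original units."""
    return np.expm1(np.asarray(y_transformed, dtype=np.float64))


def zca_whiten(X, eps=1e-5):
    """ZCA-whiten the rows of X, returning (X_white, mean, W).

    eps is an assumed small regularizer that keeps the inverse square
    root stable when the covariance has near-zero eigenvalues.
    """
    X = np.asarray(X, dtype=np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean                                  # center each feature
    cov = Xc.T @ Xc / Xc.shape[0]                  # feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)         # cov = V diag(s) V^T
    eigvals = np.maximum(eigvals, 0.0)             # guard against tiny negatives
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return Xc @ W, mean, W
```

The `mean` and `W` returned from the training data would be reused to whiten held-out data, so that no statistics leak from val/test.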

estherrolf avatar Jul 11 '23 15:07 estherrolf

@nilsleh should we try to sneak this into v0.5.2?

adamjstewart avatar Feb 29 '24 12:02 adamjstewart