
Datamodule augmentation defaults

calebrob6 opened this issue 1 year ago · 8 comments

Summary

Currently the datamodules divide by 255 by default. This can confuse users who would expect no augmentations by default. We should either make this behavior clear (e.g., with a WARNING-level message) or not divide by default.
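For context, a minimal sketch of the behavior in question, assuming a Kornia-based normalization along the lines of what the datamodules apply (the mean/std values here are illustrative, not the exact TorchGeo code):

```python
import torch
import kornia.augmentation as K

# Illustrative default: subtract 0, divide by 255.
aug = K.Normalize(mean=torch.tensor(0.0), std=torch.tensor(255.0))

batch = torch.randint(0, 256, (4, 3, 64, 64), dtype=torch.uint8).float()
out = aug(batch)
print(out.min().item(), out.max().item())  # values now roughly in [0, 1]
```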

Rationale

No response

Implementation

No response

Alternatives

No response

Additional information

No response

calebrob6 avatar Feb 01 '24 19:02 calebrob6

For the record, torchvision also divides by 255 by default. But yeah, I'm fine with dividing by 1 by default and allowing data modules to override the std dev.

adamjstewart avatar Feb 01 '24 20:02 adamjstewart

They only divide by 255 if it is uint8, which makes more sense (but still trips me up sometimes).
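A quick illustration of torchvision's dtype-dependent behavior (using `to_tensor`; exact semantics may vary by version):

```python
import numpy as np
from torchvision.transforms.functional import to_tensor

uint8_img = np.full((8, 8, 3), 255, dtype=np.uint8)
float_img = np.full((8, 8, 3), 255.0, dtype=np.float32)

print(to_tensor(uint8_img).max())  # tensor(1.) -- uint8 is divided by 255
print(to_tensor(float_img).max())  # tensor(255.) -- float is passed through
```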

calebrob6 avatar Feb 01 '24 20:02 calebrob6

I'm fine with changing the default to 1; we just need to add a backwards-incompatible tag to warn people, and convert all existing data modules to 255 (assuming that's right). Want to make a PR?

Note that some of @lcoandrade's confusion in #1822 was that Raster Vision divides by 255 automatically but he wasn't sure if TorchGeo did too. Maybe he has opinions on this.

adamjstewart avatar Feb 01 '24 20:02 adamjstewart

Hi there!

I think the default behavior should be to normalize to [0, 1] according to the data type used. This would avoid confusion.

I also think this information should be made clear in the API. I could only verify the normalization behavior by checking the code on GitHub after @adamjstewart mentioned it to me.

lcoandrade avatar Feb 07 '24 13:02 lcoandrade

Just my two cents here: "automatically" normalizing data is a really painful experience when working mostly with raw satellite data. A simple example is Sentinel-2, where we have int32 data from the L2A product: it shouldn't be normalized by dividing by 65536, but by dividing by 10,000 and then clipping the values to [0, 1].

One thing that might help is defaulting to the expectation that image inputs are always in [0, 1], or maybe allowing the user to pass a callable at initialization to handle the data (see the sketch below) -- this is something I always need to hack around to be able to use these libraries.
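A hedged sketch of that callable-based idea, e.g. for Sentinel-2 L2A surface reflectance (divide by 10,000, then clip to [0, 1]); the `normalize_fn` keyword and `SomeDataModule` are hypothetical, not an existing TorchGeo API:

```python
import torch

def s2_l2a_normalize(x: torch.Tensor) -> torch.Tensor:
    """Scale Sentinel-2 L2A reflectance to [0, 1]."""
    return torch.clamp(x.float() / 10_000.0, 0.0, 1.0)

# Imagined usage: a datamodule that accepts the callable at init time.
# datamodule = SomeDataModule(..., normalize_fn=s2_l2a_normalize)
```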

johnnv1 avatar Feb 10 '24 18:02 johnnv1

I can do a PR with per-channel min-max scaling to [0,1] as the default augmentation.

I'm actually using this type of scaling for my own work but my implementation requires the user to have the global mins and maxes specified somewhere, as is currently the case with mean and standard deviation.

I think you should be responsible for your own data, and knowing its value range is not that harsh a requirement in my opinion, so let me know if this is OK and I can proceed with it.
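A rough sketch of what I have in mind, assuming the user supplies global per-band minima and maxima (function and parameter names are illustrative):

```python
import torch

def minmax_scale(x: torch.Tensor, mins: torch.Tensor, maxs: torch.Tensor) -> torch.Tensor:
    """Scale a (B, C, H, W) batch to [0, 1] per channel using global stats."""
    mins = mins.view(1, -1, 1, 1)
    maxs = maxs.view(1, -1, 1, 1)
    return (x - mins) / (maxs - mins)

batch = torch.rand(2, 4, 32, 32) * 100
scaled = minmax_scale(batch, mins=torch.zeros(4), maxs=torch.full((4,), 100.0))
```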

DimitrisMantas avatar Feb 27 '24 23:02 DimitrisMantas

I can do a PR with per-channel min-max scaling to [0,1] as the default augmentation.

This isn't a good idea. We don't want batch-wise normalization, we want a single normalization parameter for the entire dataset.

Let's default to Normalization(mean=0, std=1) (i.e., subtract zero and divide by 1, i.e., do nothing). Then each data module can change this default. For backwards compatibility, we should probably set std=255 for most datasets, as this seems to be the most common range anyway.
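A sketch of that proposed default, assuming Kornia's `Normalize`: with mean=0 and std=1 it is an identity transform, so datamodules that need scaling can simply override std (e.g. std=255):

```python
import torch
import kornia.augmentation as K

identity = K.Normalize(mean=torch.tensor(0.0), std=torch.tensor(1.0))
x = torch.rand(1, 3, 16, 16)
assert torch.allclose(identity(x), x)  # subtract 0, divide by 1: a no-op
```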

adamjstewart avatar Feb 28 '24 09:02 adamjstewart

Ah, please excuse the confusion; by "per-channel" I meant that if you have an n-band raster, each of its channels gets normalised according to its own minimum and maximum values.

This approach was required in my case because half my bands are aerial unsigned 8-bit imagery whereas the remaining ones represent rasterised LiDAR attributes (e.g., reflectance, slope, etc.), each with its own unit of measurement and value range.

But it’s true that all batches are treated the same way and are scaled according to global parameters, as you mention.

That being said, I think doing nothing may be a more general approach.

DimitrisMantas avatar Feb 28 '24 20:02 DimitrisMantas