USAVars Augmentation maps to 0
Description
In the USAVars Datamodule, the default augmentation from NonGeoDatamodule is used. However, the dataset returns uint8 data, and it comes out of the augmentation still as uint8. This means you get an error when trying to train but also that your input images are just all zeros.
Steps to reproduce
dm = USAVarsDataModule(root="path/to/usa_vars", batch_size=16)
dm.setup("fit")
dl = dm.train_dataloader()
batch = next(iter(dl))
aug_batch = dm.aug(batch)
print(aug_batch["image"].max())
Version
'0.5.0.dev0'
Also not sure to what extent other current datasets might have the same issue because it's a silent bug.
This is another reason I want #985. Although I don't think rasterio can statically infer the type of the file, so maybe that wouldn't help. We'd have to carefully annotate each read.
I am reopening this because I had two other people tell me they had the same thing happening with their custom datasets. Is there maybe a way we could implement a warning at some stage? https://github.com/microsoft/torchgeo/blob/4da988a9112b72f328e5bee6490e8f7ad3a784f0/torchgeo/transforms/transforms.py#L83 Or some other check, because it's a quiet sneaky phenomenon, that takes some digging to understand why.
We could certainly warn, but I want to make sure warnings are uncommon. I also really want to get rid of our AugmentationSequential wrapper and instead use the Kornia one. The biggest thing holding us back is https://github.com/kornia/kornia/issues/2119. I started working on this but quickly found it more difficult than I expected.
I think one of the following solutions would be fine:
- Implement https://github.com/kornia/kornia/issues/2119 and drop our
AugmentationSequential - Drop our
AugmentationSequentialand usedata_keyseverywhere - Remove type conversion from our
AugmentationSequentialand force datasets/datamodules/trainers to handle it - Add a warning when we convert dtypes
There are roughly sorted in order from "this is what I want long-term" to "this would be tolerable short-term".