VHR10 uses 'masks' as data_key, not 'mask'
Description
As title. All other datasets use mask, which is supported by kornia and required by the segmentation trainer. This should be addressed by converting references of masks to mask. I understand AugPipe will be removed in https://github.com/microsoft/torchgeo/pull/1978, but collate_fn_detection is also affected.
Not sure how this is related, but there appears to be one mask per annotated box:
image torch.Size([3, 563, 792])
boxes torch.Size([3, 4])
labels torch.Size([3])
mask torch.Size([3, 563, 792])
It appears the mask itself does not encode the class value, so I gather this is not meant to be used with the segmentation trainer after all.
Steps to reproduce
NA
Version
main
You're correct in that the masks field is not to be used with the segmentation trainer; you need to use the mask field for that. Currently, within torchgeo there is a distinction between masks and mask.
mask is used to denote semantic segmentation masks. These are tensors that encode class values.
masks are used to denote instance segmentation masks. These are (N, H, W) tensors where N is the number of unique objects/instances detected.
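To make the difference concrete, here is a minimal sketch of collapsing an (N, H, W) instance masks tensor plus its labels into a single (H, W) semantic mask. The helper name instance_to_semantic is hypothetical and not part of torchgeo. It also assumes that class 0 can be used as background and that, where instances overlap, the later instance wins; neither assumption is confirmed for VHR10.

```python
import torch


def instance_to_semantic(masks: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Collapse (N, H, W) binary instance masks into one (H, W) class-index mask.

    Hypothetical helper: assumes label 0 is free to mean background and that
    later instances overwrite earlier ones where they overlap.
    """
    semantic = torch.zeros(masks.shape[-2:], dtype=torch.long)
    for instance, label in zip(masks, labels):
        # Paint every pixel belonging to this instance with its class value.
        semantic[instance.bool()] = int(label)
    return semantic


# Toy sample shaped like the one in the report: N=3 instances on a 4x5 image.
masks = torch.zeros(3, 4, 5, dtype=torch.uint8)
masks[0, 0:2, 0:2] = 1
masks[1, 2:4, 3:5] = 1
masks[2, 1:3, 1:3] = 1  # overlaps instance 0 at pixel (1, 1)
labels = torch.tensor([1, 2, 1])

mask = instance_to_semantic(masks, labels)
print(masks.shape, "->", mask.shape)
```

The result has one class value per pixel, which is the layout a semantic segmentation trainer expects, but the per-object separation in masks is lost in the process.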
This distinction exists because kornia did not support instance masks, so there was no way to distinguish between the two. The separate masks key was therefore introduced in our custom AugmentationSequential to ensure both keys are handled appropriately.
This support is being added to kornia, and with its next release, we should hopefully be able to finally remove the masks field and our custom AugmentationSequential.
Awaiting https://github.com/microsoft/torchgeo/pull/1978
Fixed in #1978 or one of the later PRs.