void label not correctly ignored in metrics/voc_segmentation.py ?
I'm not sure if I'm misunderstanding something or have noticed a bug in https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/metrics/voc_segmentation.py.

The function `batch_intersection_union` adds 1 to the target array and then performs a `target > 0` comparison on it. My understanding is that the `> 0` comparison is meant to ignore void-labeled pixels, but I think adding 1 prior to the comparison would circumvent this. The function `intersectionAndUnion` in the same file performs the same comparison but does not add 1 to the target/label array.
@zhanghang1989 Can you have a look?
Actually... is it even correct to be ignoring pixels labeled as `background`? Pixels labeled `ambiguous` (255) are obviously to be ignored, but I would have thought pixels with ground-truth label `background` that are predicted as some other class count as false positives.
The PASCAL VOC metric description, according to the Cityscapes benchmark, reads:
"we rely on the standard Jaccard Index, commonly known as the PASCAL VOC intersection-over-union metric IoU = TP ⁄ (TP+FP+FN) [1], where TP, FP, and FN are the numbers of true positive, false positive, and false negative pixels, respectively, determined over the whole test set. ... pixels labeled as void do not contribute to the score."
Does "void" here refer to the `ambiguous` or the `background` class?
In any case, perhaps it would be good to make ignoring `background` pixels an option.
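For concreteness, here is a minimal sketch of the IoU = TP ⁄ (TP+FP+FN) computation quoted above, with void pixels (255 in the raw mask) excluded from all three counts; the toy arrays are made up for illustration:

```python
import numpy as np

# Toy 1-D "images": ground truth and prediction over 3 classes (0, 1, 2),
# with 255 marking void pixels that must not contribute to the score.
gt   = np.array([0, 0, 1, 1, 2, 2, 255, 255])
pred = np.array([0, 1, 1, 1, 2, 0, 0,   2])

valid = gt != 255  # void pixels are dropped from TP, FP, and FN alike

ious = []
for cls in range(3):
    tp = np.sum((pred == cls) & (gt == cls) & valid)
    fp = np.sum((pred == cls) & (gt != cls) & valid)
    fn = np.sum((pred != cls) & (gt == cls) & valid)
    ious.append(tp / (tp + fp + fn))  # IoU = TP / (TP + FP + FN)

mean_iou = sum(ious) / len(ious)
```

Note that under this reading, a `background` pixel predicted as some other class does show up as a false positive for that class; only the 255-labeled pixels vanish from the score.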
The label -1 is ignored during the training and evaluation.
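Since void pixels in the transformed target are encoded as -1 rather than 255, adding 1 maps them to 0, and the `> 0` mask then drops exactly those pixels while keeping `background` (0 becomes 1). A minimal sketch of that masking logic (illustrative only, not the actual gluon-cv code):

```python
import numpy as np

# Transformed target: void is -1, background is 0, object classes are 1..N.
target = np.array([-1, -1, 0, 0, 1, 2])

shifted = target + 1  # void -> 0, background -> 1, classes -> 2..N+1
keep = shifted > 0    # False only for the former -1 (void) pixels

print(shifted[keep])  # background (now 1) survives; void pixels are gone
```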
So after wondering where a -1 label would be coming from (last I heard, indexed-mode images usually use unsigned integers for their indices), I did some digging and found this in `data/pascal_voc/segmentation.py`:
```python
def _mask_transform(self, mask):
    target = np.array(mask).astype('int32')
    target[target == 255] = -1  # remap ambiguous (255) to -1
    return F.array(target, cpu(0))
```
Which explains where the -1 comes from. It would be good to have some more comments so that others looking to work with or contribute to Gluon CV know what's happening; in this case a single comment in `batch_intersection_union` would have saved us all a lot of time ;).
Also, perhaps the `intersectionAndUnion` and `pixelAccuracy` functions should also add 1 in case they're passed a label/target that has been transformed? Or at least the function documentation should be updated to indicate the expected input types.
Thanks for the comments! I agree and will add more comments about the util functions.