Beginner Question about Evaluation Metrics
I have a very beginner question about what the eval metrics actually mean. I can't quite seem to find anything in the docs (if I missed it, I'm sorry! I looked for quite a while before asking, I promise).
Currently, I am gathering these metrics on validation and test: [aAcc, mAcc, mIoU, mDice, mFscore, mPrecision, mRecall]. I can use Google to figure out what each of these means for a single image vs. its ground truth... but what do they mean in the context of the val/test evaluation as a whole?
If we take aAcc as an example (calculated here): does it take the sum of all class intersections with the GT, divided by the sum of the area of the GT?
I think my comprehension stops here. I have googled and tried to understand these few lines, but for some reason something is not clicking:
intersect = pred_label[pred_label == label]  # I know this one! (the predicted class ids at pixels where pred == GT)
area_intersect = torch.histc(
    intersect.float(), bins=(num_classes), min=0,
    max=num_classes - 1).cpu()  # per-class count of correctly predicted pixels
area_pred_label = torch.histc(
    pred_label.float(), bins=(num_classes), min=0,
    max=num_classes - 1).cpu()  # per-class count of predicted pixels
area_label = torch.histc(
    label.float(), bins=(num_classes), min=0,
    max=num_classes - 1).cpu()  # per-class count of ground-truth pixels
area_union = area_pred_label + area_label - area_intersect  # This one is easy too!
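In case it helps show where I'm at, here is the tiny standalone toy I've been poking at to see what those histc calls actually count (the tensors and numbers below are made up by me, not taken from the repo):

import torch

# Made-up toy: 2 classes, a "prediction" and a "ground truth" of 8 pixels each.
num_classes = 2
pred_label = torch.tensor([0, 0, 1, 1, 1, 0, 1, 1])
label = torch.tensor([0, 0, 1, 1, 0, 0, 0, 1])

# Same recipe as the snippet above.
intersect = pred_label[pred_label == label]
area_intersect = torch.histc(intersect.float(), bins=num_classes, min=0, max=num_classes - 1)
area_pred_label = torch.histc(pred_label.float(), bins=num_classes, min=0, max=num_classes - 1)
area_label = torch.histc(label.float(), bins=num_classes, min=0, max=num_classes - 1)
area_union = area_pred_label + area_label - area_intersect

print(area_intersect / area_union)              # per-class IoU -> tensor([0.6000, 0.6000])
print(area_intersect / area_label)              # per-class Acc -> tensor([0.6000, 1.0000])
print(area_intersect.sum() / area_label.sum())  # overall pixel accuracy -> tensor(0.7500)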
Final question: to get the mIoU from the individual IoUs, is it just the average of the IoU over every image in the test/val set? Or do we first take the average per class against the GT, and then average over the samples?
I appreciate your time and help, thank you!
Taking mAcc and aAcc as examples: when the class distribution is extremely unbalanced, there is a significant difference between the two. mAcc simply computes the Acc of each class and averages them, while aAcc treats all classes as one and computes a single overall Acc.
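To make the difference concrete, here is a tiny made-up example (all numbers invented for illustration) with one dominant class and one rare class:

import torch

# Invented per-class pixel counts for a very unbalanced 2-class dataset:
# class 0 (e.g. background) dominates, class 1 is rare.
area_intersect = torch.tensor([9000., 10.])   # correctly classified pixels per class
area_label = torch.tensor([10000., 100.])     # ground-truth pixels per class

acc_per_class = area_intersect / area_label        # tensor([0.9000, 0.1000])
mAcc = acc_per_class.mean()                        # 0.50 -> every class weighs the same
aAcc = area_intersect.sum() / area_label.sum()     # ~0.89 -> dominated by the big class
print(mAcc.item(), aAcc.item())

So a model that largely ignores the rare class can still report a high aAcc, while mAcc exposes the problem.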
Thank you for your kind reply! With this help, I was able to google more and get a better understanding.