Document significance of macro vs micro averaging

Open robmarkcole opened this issue 1 year ago • 0 comments

Issue

In the segmentation trainer, MulticlassAccuracy and MulticlassJaccardIndex use average="micro" rather than the torchmetrics default, average="macro".

From Slack: Macro weights per patch (what is the average accuracy of each patch?), whereas micro weights per pixel (what is the average accuracy of each pixel across the entire dataset?). These two metrics vary greatly when the number of nodata pixels varies between patches. If one patch is mostly nodata pixels and another patch is mostly data pixels, then macro takes the average of the two per-patch scores, while micro pools the pixels of both patches before computing the score.

Macro might be okay for some applications (such as pre-chipped datasets) but is definitely not what you want for others (such as geospatial datasets).
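To make the difference concrete, here is a minimal pure-Python sketch of the two weightings as described in the Slack explanation. The patch counts are made up for illustration, and the helper names `per_patch_average` and `per_pixel_average` are hypothetical; this is not torchgeo or torchmetrics code.

```python
# Hypothetical per-patch counts: (correct_pixels, valid_pixels), nodata excluded.
patches = [
    (9, 10),      # mostly nodata: only 10 valid pixels, 9 of them correct
    (500, 1000),  # mostly data: 1000 valid pixels, 500 of them correct
]


def per_patch_average(patches):
    """Average the accuracy of each patch, so every patch counts equally."""
    return sum(correct / valid for correct, valid in patches) / len(patches)


def per_pixel_average(patches):
    """Pool all valid pixels first, so every pixel counts equally."""
    total_correct = sum(correct for correct, _ in patches)
    total_valid = sum(valid for _, valid in patches)
    return total_correct / total_valid


print(f"per patch: {per_patch_average(patches):.3f}")  # 0.700
print(f"per pixel: {per_pixel_average(patches):.3f}")  # 0.504
```

The small, mostly-nodata patch pulls the per-patch average up to 0.700, while the pooled score of 0.504 is dominated by the patch that actually contains most of the valid pixels.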

Further discussion: one proposal is to report both macro and micro metrics and name them accordingly: Overall Accuracy (OA) for micro and Average Accuracy (AA) for macro.
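The OA/AA naming can be illustrated with a confusion matrix. The sketch below assumes the definitions commonly used in remote sensing: OA pools every pixel, and AA is the mean of the per-class accuracies, which matches how torchmetrics documents average="macro" (statistics are computed per class and then averaged). The function names and the example matrix are hypothetical, and skipping classes with no true pixels is a choice made for this sketch only.

```python
def overall_accuracy(cm):
    """OA (micro): correctly classified pixels divided by all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def average_accuracy(cm):
    """AA (macro): mean of per-class accuracies; rows are true classes."""
    per_class = [
        cm[i][i] / sum(cm[i])
        for i in range(len(cm))
        if sum(cm[i]) > 0  # skip classes with no true pixels (sketch choice)
    ]
    return sum(per_class) / len(per_class)


# Made-up 2-class confusion matrix: rows are true labels, columns are predictions.
cm = [
    [90, 10],  # frequent class: 90 of 100 pixels correct
    [5, 5],    # rare class: 5 of 10 pixels correct
]

print(f"OA: {overall_accuracy(cm):.3f}")  # 0.864
print(f"AA: {average_accuracy(cm):.3f}")  # 0.700
```

Reporting both values side by side shows at a glance whether a high overall score hides poor performance on a rare class.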

Fix

Refine the description above and document it at https://torchgeo.readthedocs.io/en/stable/api/trainers.html#torchgeo.trainers.SemanticSegmentationTask.configure_metrics

robmarkcole, Feb 09 '24 13:02