Numerical Instability in metrics.py
When I use metrics.py to evaluate a model with the same weights, I get different mIoU values across runs.
I am using your DeepLab implementation as the backbone of another network, along with your evaluation code.
Below are three such runs, where metrics.py was used to evaluate the model on the same validation set with the same weights.
RUN 1
> 'Pixel Accuracy': 0.891,
> 'Mean Accuracy': 0.755,
> 'Frequency Weighted IoU': 0.810,
> 'Mean IoU': 0.615,
RUN 2
> 'Pixel Accuracy': 0.896,
> 'Mean Accuracy': 0.761,
> 'Frequency Weighted IoU': 0.819,
> 'Mean IoU': 0.622,
RUN 3
> "Pixel Accuracy": 0.882
> "Mean Accuracy": 0.748,
> "Frequency Weighted IoU": 0.798,
> "Mean IoU": 0.609,
It seems like an issue of numerical instability.
In particular, I suspect that either the _fast_hist function or the division in the scores function in utils/metric.py is the root cause.
I would greatly appreciate any help here. Thank you!
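For reference, here is a minimal sketch of how such histogram-based metrics are typically computed (this assumes an implementation along the lines of the common _fast_hist / scores pattern; the actual utils/metric.py may differ). Given identical predictions and labels, these operations are integer accumulation followed by a single float division, so they should be bit-identical across runs, which is why the network output itself is worth checking first.

```python
import numpy as np

def _fast_hist(label_true, label_pred, n_class):
    # Build an n_class x n_class confusion matrix from flattened label arrays,
    # ignoring pixels whose ground-truth label falls outside [0, n_class).
    mask = (label_true >= 0) & (label_true < n_class)
    hist = np.bincount(
        n_class * label_true[mask].astype(int) + label_pred[mask],
        minlength=n_class ** 2,
    ).reshape(n_class, n_class)
    return hist

def scores(label_trues, label_preds, n_class):
    # Accumulate the confusion matrix over the whole validation set,
    # then derive pixel accuracy, mean accuracy, FW IoU and mean IoU.
    hist = np.zeros((n_class, n_class))
    for lt, lp in zip(label_trues, label_preds):
        hist += _fast_hist(lt.flatten(), lp.flatten(), n_class)
    acc = np.diag(hist).sum() / hist.sum()
    acc_cls = np.nanmean(np.diag(hist) / hist.sum(axis=1))
    iou = np.diag(hist) / (hist.sum(axis=1) + hist.sum(axis=0) - np.diag(hist))
    freq = hist.sum(axis=1) / hist.sum()
    fwavacc = (freq[freq > 0] * iou[freq > 0]).sum()
    return {
        'Pixel Accuracy': acc,
        'Mean Accuracy': acc_cls,
        'Frequency Weighted IoU': fwavacc,
        'Mean IoU': np.nanmean(iou),
    }
```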
Is the output of your network consistent for each run? Let's first make sure the inference itself is deterministic.
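A quick way to isolate this is to fix all random seeds, disable non-deterministic cuDNN kernels, and compare a checksum of the predictions across two identical inference passes. The sketch below assumes a PyTorch model and data loader; make_deterministic and checksum_predictions are hypothetical helper names, not functions from this repo.

```python
import random
import numpy as np
import torch

def make_deterministic(seed=0):
    # Fix all RNG seeds and disable non-deterministic cuDNN autotuning/kernels.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

@torch.no_grad()
def checksum_predictions(model, loader, device='cuda'):
    # Sum the argmax predictions over the whole validation set.
    # Run this twice with the same weights: if the two checksums differ,
    # the variation comes from the network / data pipeline, not from the
    # metric code.
    model.eval()
    total = 0
    for images, _ in loader:
        preds = model(images.to(device)).argmax(dim=1)
        total += int(preds.sum().item())
    return total
```

If the checksums do differ, the usual culprits are a model left in training mode (dropout or batch-norm updates active), a shuffled or randomly augmented validation loader, or non-deterministic GPU ops, rather than _fast_hist or the division in scores.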