torchmetrics Performance difference between `v0.9.3` <-> `v0.10.0`

🐛 Bug

TorchMetrics runs at very different speeds on v0.9.3 and v0.10.0: the same profiled run takes 17.1 s in total on v0.9.3 and 85.8 s on v0.10.0.

To Reproduce

Run the same metric calculation in environments with different TorchMetrics versions installed.

Results for v0.9.3

----------------------------------------------------------------------------------------------------------------------------
| Action                                                   | Mean duration (s) | Num calls | Total time (s) | Percentage % |
----------------------------------------------------------------------------------------------------------------------------
| Total                                                    | -                 | 1042      | 17.113         | 100 %        |
----------------------------------------------------------------------------------------------------------------------------
| [LightningDataModule]SegmentationDataModule.prepare_data | 5.1728            | 1         | 5.1728         | 30.228       |
| run_training_epoch                                       | 4.6598            | 1         | 4.6598         | 27.23        |
| [Strategy]SingleDeviceStrategy.validation_step           | 0.28901           | 12        | 3.4681         | 20.266       |
| [LightningDataModule]SegmentationDataModule.setup        | 2.2177            | 1         | 2.2177         | 12.959       |
| run_training_batch                                       | 0.16763           | 10        | 1.6763         | 9.7954       |
| [LightningModule]Model.optimizer_step                    | 0.16661           | 10        | 1.6661         | 9.7361       |
| [TrainingEpochLoop].train_dataloader_next                | 0.13849           | 10        | 1.3849         | 8.0928       |
| [Strategy]SingleDeviceStrategy.backward                  | 0.077044          | 10        | 0.77044        | 4.5021       |
| [Strategy]SingleDeviceStrategy.training_step             | 0.056393          | 10        | 0.56393        | 3.2954       |
----------------------------------------------------------------------------------------------------------------------------

Results for v0.10.0

----------------------------------------------------------------------------------------------------------------------------
| Action                                                   | Mean duration (s) | Num calls | Total time (s) | Percentage % |
----------------------------------------------------------------------------------------------------------------------------
| Total                                                    | -                 | 1042      | 85.772         | 100 %        |
----------------------------------------------------------------------------------------------------------------------------
| run_training_epoch                                       | 62.396            | 1         | 62.396         | 72.747       |
| [Strategy]SingleDeviceStrategy.validation_step           | 3.159             | 12        | 37.908         | 44.196       |
| run_training_batch                                       | 3.6079            | 10        | 36.079         | 42.064       |
| [LightningModule]Model.optimizer_step                    | 3.6072            | 10        | 36.072         | 42.056       |
| [Strategy]SingleDeviceStrategy.training_step             | 3.5262            | 10        | 35.262         | 41.112       |
| [LightningDataModule]SegmentationDataModule.prepare_data | 5.1832            | 1         | 5.1832         | 6.043        |
| [LightningDataModule]SegmentationDataModule.setup        | 2.2608            | 1         | 2.2608         | 2.6358       |
| [TrainingEpochLoop].train_dataloader_next                | 0.1303            | 10        | 1.303          | 1.5191       |
----------------------------------------------------------------------------------------------------------------------------
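
To make the comparison easier to read, the per-action slowdown can be computed directly from the "Total time (s)" columns of the two profiler tables above. This is only a small arithmetic sketch: the dictionary and function names are made up for illustration, and the numbers are copied verbatim from the tables.

```python
# "Total time (s)" per action, copied from the two profiler tables above.
TOTAL_TIME_V0_9_3 = {
    "Total": 17.113,
    "SegmentationDataModule.prepare_data": 5.1728,
    "run_training_epoch": 4.6598,
    "SingleDeviceStrategy.validation_step": 3.4681,
    "SegmentationDataModule.setup": 2.2177,
    "run_training_batch": 1.6763,
    "Model.optimizer_step": 1.6661,
    "train_dataloader_next": 1.3849,
    "SingleDeviceStrategy.backward": 0.77044,
    "SingleDeviceStrategy.training_step": 0.56393,
}
TOTAL_TIME_V0_10_0 = {
    "Total": 85.772,
    "run_training_epoch": 62.396,
    "SingleDeviceStrategy.validation_step": 37.908,
    "run_training_batch": 36.079,
    "Model.optimizer_step": 36.072,
    "SingleDeviceStrategy.training_step": 35.262,
    "SegmentationDataModule.prepare_data": 5.1832,
    "SegmentationDataModule.setup": 2.2608,
    "train_dataloader_next": 1.303,
}


def slowdown(before, after):
    """Return after/before for every action present in both tables."""
    return {name: after[name] / before[name] for name in before if name in after}


if __name__ == "__main__":
    ratios = slowdown(TOTAL_TIME_V0_9_3, TOTAL_TIME_V0_10_0)
    # Print the actions sorted from largest to smallest slowdown.
    for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
        print(f"{name:<40} {ratio:6.2f}x")
```

On these numbers the data-module hooks (`prepare_data`, `setup`) and the dataloader are essentially unchanged, while `training_step` and `validation_step` account for most of the increase. `backward` is missing from the ratios because it does not appear in the v0.10.0 table.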

Code sample

Since it's a private project, I can only share the metric names and configurations used.

For training and validation (`tm` is `import torchmetrics as tm`):

metric_params = {
    "num_classes": self.num_classes,
    "average": None,
    "mdmc_average": "samplewise",
}

train = tm.MetricCollection(
    {
        "IoU": tm.JaccardIndex(**metric_params),
        "DSC": tm.Dice(**metric_params),
    },
    prefix="Train/Seg/",
)

validation = tm.MetricCollection(
    {
        "IoU": tm.JaccardIndex(**metric_params),
        "DSC": tm.Dice(**metric_params),
        "Spec": tm.Specificity(**metric_params),
        "Sens": tm.Recall(**metric_params),
    },
    prefix="Val/Seg/",
)

Expected behavior

v0.10.0 should be at least as fast as v0.9.3 for the same metric configuration.

Environment

TorchMetrics version v0.9.3 and v0.10.0
Python 3.8.10, PyTorch 1.12.1+cu116
Linux 5.15.0-1018-gcp #24~20.04.1-Ubuntu SMP Mon Sep 12 06:14:01 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Oct 13 '22 12:10 omerferhatt

I'm seeing something similar, particularly with JaccardIndex.

Oct 13 '22 16:10 tayden

Hi @omerferhatt and @tayden, could one of you provide the precise configuration you are using (num_classes) and what the shape of a typical input to the metric looks like? Just to know if you are running into an edge case that we had not previously thought about :]

Oct 13 '22 19:10 SkafteNicki

Absolutely @SkafteNicki

num_classes=5
preds: (8, 5, 256, 256), dtype=float32, min=0, max=1 => Output softmax
target: (8, 256, 256), dtype=int64 => Multi-class labels
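
A standalone timing script along these lines might help compare the two versions outside the private project. It is a rough sketch under several assumptions: it runs on CPU, uses random inputs with the shapes listed above, passes only `num_classes` and `average` to `JaccardIndex` (the v0.9.x/v0.10.x style constructor used in this thread, with `mdmc_average` left out), and the helper names `time_calls` and `benchmark_jaccard` are made up for illustration.

```python
import time
from statistics import mean


def time_calls(fn, warmup=1, repeats=5):
    """Return the mean wall-clock seconds of fn() over `repeats` timed calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return mean(durations)


def benchmark_jaccard(num_classes=5, batch=8, size=256):
    """Time JaccardIndex on inputs shaped like the ones reported above.

    torch and torchmetrics are imported locally so that time_calls stays
    usable without them. CPU only; GPU timing would need synchronization.
    """
    import torch
    import torchmetrics as tm

    # Assumption: constructor arguments as used in this thread (v0.9.x/v0.10.x).
    metric = tm.JaccardIndex(num_classes=num_classes, average=None)
    # Random softmax-like predictions and integer multi-class labels.
    preds = torch.rand(batch, num_classes, size, size).softmax(dim=1)
    target = torch.randint(0, num_classes, (batch, size, size))
    return time_calls(lambda: metric(preds, target))
```

Calling `benchmark_jaccard()` once in a v0.9.3 environment and once in a v0.10.0 environment should show whether the slowdown reproduces without Lightning in the loop.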

Oct 13 '22 20:10 omerferhatt

Similar to @omerferhatt, I have:

iou_metric = JaccardIndex(num_classes=3, ignore_index=2, average="none")
probs: (2, 3, 512, 512), dtype=torch.float32 => softmax outputs
target: (2, 512, 512), dtype=torch.uint8 => multi-class labels (0,1, or 2 as values)

Oct 13 '22 20:10 tayden

I am seeing extreme slowdown with MatthewsCorrCoef too. What used to take less than a second for me now takes 10 minutes! Reverting back to 0.9.0 or 0.8.2 works just fine.

Oct 18 '22 21:10 ductai199x