
Fixed tests in test_precision_recall_curve and tests/ignite/contrib/metrics/regression

sayantan1410 opened this issue 2 years ago • 19 comments

Related to #2490

I couldn't figure out why the test was passing initially, I probably needed to gather data from all the processes with idist.all_gather. But as per the conversation in the mentioned PR, I have made the changes to it. Let me know if anything needs to be changed; I will be happy to fix it.

sayantan1410 avatar Mar 11 '22 05:03 sayantan1410

I couldn't figure out why the test was passing initially, I probably needed to gather data from all the processes with idist.all_gather.

@sayantan1410 PR seems good. However, we need to understand why the tests were passing previously. Please fetch all the data and see why it was passing. Thanks
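
For reference, "fetching all the data" in a distributed test usually means gathering the per-rank shards with idist.all_gather before comparing against sklearn. A minimal sketch, assuming placeholder local tensors rather than the exact data used in the test:

import torch
import ignite.distributed as idist
from sklearn.metrics import precision_recall_curve

# Each rank holds only its local shard of targets and predictions.
local_y_true = torch.randint(0, 2, size=(100,))
local_y_preds = torch.rand(100)

# Gather the shards so every process holds the full dataset, then compute
# the sklearn reference on exactly the same data the metric has seen.
y_true = idist.all_gather(local_y_true)
y_preds = idist.all_gather(local_y_preds)

sk_precision, sk_recall, sk_thresholds = precision_recall_curve(
    y_true.cpu().numpy(), y_preds.cpu().numpy()
)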

vfdev-5 avatar Mar 11 '22 10:03 vfdev-5

I think there are some other tests broken like this one in regression.

sdesrozis avatar Mar 11 '22 18:03 sdesrozis

@vfdev-5 Yeah, trying to look for the reason!!

sayantan1410 avatar Mar 12 '22 04:03 sayantan1410

@sdesrozis Should I change them in this PR only?

sayantan1410 avatar Mar 12 '22 04:03 sayantan1410

@sayantan1410 It would be great if you could look for other corrupted tests and fix them if you find any. I would say it's mainly in contrib/metrics/regression. You will just have to rename the PR if other fixes are done.

Thanks a lot for your help!

sdesrozis avatar Mar 12 '22 07:03 sdesrozis

Yeah sure will do that.

sayantan1410 avatar Mar 12 '22 08:03 sayantan1410

Probable reason why https://github.com/pytorch/ignite/blob/2eb3bcf4949821a3c894a21a781c5a34bb5545e2/tests/ignite/contrib/metrics/test_precision_recall_curve.py#L196 is passing in spite of being wrong: there is a difference in value between precision, recall, thresholds and sk_precision, sk_recall, sk_thresholds in a distributed configuration; however, the difference was small and pytest.approx was absorbing it, so the tests were passing. Code to reproduce:

import torch
import ignite.distributed as idist
from ignite.contrib.metrics import PrecisionRecallCurve
from sklearn.metrics import precision_recall_curve
from ignite.engine import Engine


rank = idist.get_rank()
torch.manual_seed(12)
device = idist.device()

def _test(n_epochs, metric_device):
    metric_device = torch.device(metric_device)
    n_iters = 80
    size = 151
    offset = n_iters * size
    y_true = torch.randint(0, 2, size=(offset * idist.get_world_size(),)).to(device)
    y_preds = torch.randint(0, 2, size=(offset * idist.get_world_size(),)).to(device)

    def update(engine, i):
        return (
            y_preds[i * size + rank * offset : (i + 1) * size + rank * offset],
            y_true[i * size + rank * offset : (i + 1) * size + rank * offset],
        )
    engine = Engine(update)

    prc = PrecisionRecallCurve(device=metric_device)
    prc.attach(engine, "prc")

    data = list(range(n_iters))
    engine.run(data=data, max_epochs=n_epochs)

    assert "prc" in engine.state.metrics

    precision, recall, thresholds = engine.state.metrics["prc"]

    np_y_true = y_true.cpu().numpy().ravel()
    np_y_preds = y_preds.cpu().numpy().ravel()

    sk_precision, sk_recall, sk_thresholds = precision_recall_curve(np_y_true, np_y_preds)
    print(f"precision: {precision}")
    print(f"recall: {recall}")
    print(f"thresholds: {thresholds}")
    print(f"sk_precision: {sk_precision}")
    print(f"sk_recall: {sk_recall}")
    print(f"sk_thresholds: {sk_thresholds}")

metric_devices = ["cpu"]
if device.type != "xla":
    metric_devices.append(idist.device())
for metric_device in metric_devices:
    for _ in range(1):
        print("new test")
        _test(n_epochs=1, metric_device=metric_device)

Command I used to run the code:

torchrun --nproc_per_node 2 filename.py --backend <preferred backend>
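
As an aside, the reason such a small per-rank discrepancy can go unnoticed is that pytest.approx compares with a default relative tolerance of 1e-6. A minimal illustration of that behaviour, with made-up values unrelated to the actual metric output:

import numpy as np
import pytest

sk_precision = np.array([0.5012345, 0.751234, 1.0])
# Simulate a tiny discrepancy, similar to what shows up in the distributed run.
precision = sk_precision + 1e-8

# Passes: the difference is below the default relative tolerance of 1e-6,
# so the comparison does not notice the mismatch.
assert precision == pytest.approx(sk_precision)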

sayantan1410 avatar Mar 13 '22 14:03 sayantan1410

@vfdev-5 @sdesrozis Hey, the tests for the metrics in contrib/metrics/regression are probably fixed now; can you check once?

sayantan1410 avatar Mar 13 '22 14:03 sayantan1410

@sayantan1410 your repro code example does not reproduce the issue and does not work with DDP as there is no communication group created.

vfdev-5 avatar Mar 14 '22 10:03 vfdev-5

@vfdev-5

your repro code example does not reproduce the issue and does not work with DDP as there is no communication group created.

Yeah, I realized this now. Also, can you review the changes in this PR once? It would be helpful for me.

sayantan1410 avatar Mar 14 '22 17:03 sayantan1410

@vfdev-5 @sdesrozis this was the change I was thinking about, please check once.

sayantan1410 avatar Mar 20 '22 13:03 sayantan1410

@vfdev-5 yeah, now I understand it better than at that time :) Also, what should I do to make the tests uniform?

sayantan1410 avatar Mar 20 '22 16:03 sayantan1410

@vfdev-5 Hey, any idea why the hvd test is failing? The other tests are passing now.

sayantan1410 avatar Mar 21 '22 08:03 sayantan1410

@sayantan1410 looks like the majority of tests are failing, not only hvd... Check:

  • https://github.com/pytorch/ignite/runs/5618074267?check_suite_focus=true
  • https://github.com/pytorch/ignite/runs/5618074236?check_suite_focus=true#step:9:1624

vfdev-5 avatar Mar 21 '22 10:03 vfdev-5

@vfdev-5 Hey, those are from the old runs; I have made a commit after that, and it shows the tests passing except for hvd. You probably need to refresh the page.

sayantan1410 avatar Mar 21 '22 12:03 sayantan1410

Oh, I see, sorry. I reran the tests; the output looks strange.

vfdev-5 avatar Mar 21 '22 12:03 vfdev-5

@vfdev-5 Hey, I am getting an assertion error unless I increase the tolerance. Any idea how I should fix this?
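
For context, the tolerance being discussed here is the rel/abs argument of pytest.approx; loosening it makes the assertion pass but also hides the discrepancy. A hedged illustration with made-up numbers:

import numpy as np
import pytest

precision = np.array([0.5010, 0.7510, 1.0])
sk_precision = np.array([0.5012, 0.7512, 1.0])

# This would fail with the default relative tolerance of 1e-6:
# assert precision == pytest.approx(sk_precision)

# ... but passes once the tolerance is increased, which is why relying on a
# looser tolerance can mask a real mismatch between distributed and reference values.
assert precision == pytest.approx(sk_precision, rel=1e-2)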

sayantan1410 avatar Mar 23 '22 06:03 sayantan1410

@sayantan1410 can we split this PR so that the improved tests that are passing can be merged, and those that are not passing can be investigated case by case? Otherwise you have a stuck block of modifications...

vfdev-5 avatar Apr 09 '22 15:04 vfdev-5

@vfdev-5 I could do that; basically, only the precision_recall_curve one is working. If possible, can we have a quick meeting? I can show you where I am stuck, and maybe we can solve the issue.

sayantan1410 avatar Apr 09 '22 16:04 sayantan1410

Closing this PR in favor of #2655. Thanks for your work @sayantan1410 !

vfdev-5 avatar Sep 04 '22 15:09 vfdev-5