
🐞 Cannot update metric with FastFlow on custom folder dataset

Open haimat opened this issue 5 months ago • 6 comments

Describe the bug

I have a custom dataset with normal images (in the "good" folder) and abnormal images (in the "bad" folder). Now I want to train a FastFlow model on it for a classification task. This is the code to define the dataset and train the model:

# Presumed imports (not included in the original snippet)
from torchvision.transforms import v2 as transforms
from anomalib.data import Folder as AnomalibFolder
from anomalib.data.utils import TestSplitMode, ValSplitMode
from anomalib.engine import Engine

# Set input size for the model
input_size = (config["img_width"], config["img_height"])
transform_pipeline = [transforms.Resize(input_size)]
transform = transforms.Compose(transform_pipeline)

# Create dataset from folder
datamodule = AnomalibFolder(
    name="My Dataset",
    root=str(anomalib_dir),
    normal_dir="good",
    abnormal_dir="bad",
    train_batch_size=config["batch_size"],
    eval_batch_size=config["batch_size"],
    augmentations=transform,
    num_workers=config["workers"],
    test_split_mode=TestSplitMode.FROM_DIR,
    val_split_ratio=config["val_split_ratio"],
    val_split_mode=ValSplitMode.FROM_TRAIN,
    test_split_ratio=config["test_split_ratio"],
)

# Import the model class dynamically based on the user selection
model = _get_model_class_by_name(train.model_name)()

# Train the model using the Anomalib Engine
engine = Engine(
    max_epochs=config["epochs"],
    accelerator="gpu",
    devices=-1,
    callbacks=callbacks,
    strategy="ddp_find_unused_parameters_true",
)
engine.fit(datamodule=datamodule, model=model)

However, after the first epoch I get this error:

Traceback (most recent call last):
  File "/home/mfb/.local/lib/python3.10/site-packages/anomalib/engine/engine.py", line 416, in fit
    self.trainer.fit(model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 561, in fit
    call._call_and_handle_interrupt(
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 47, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/strategies/launchers/subprocess_script.py", line 105, in launch
    return function(*args, **kwargs)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 599, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1012, in _run
    results = self._run_stage()
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1056, in _run_stage
    self.fit_loop.run()
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py", line 216, in run
    self.advance()
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py", line 455, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 153, in run
    self.on_advance_end(data_fetcher)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 394, in on_advance_end
    self.val_loop.run()
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/utilities.py", line 179, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 145, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter)
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 451, in _evaluation_step
    call._call_callback_hooks(trainer, hook_name, output, *hook_kwargs.values())
  File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 227, in _call_callback_hooks
    fn(trainer, trainer.lightning_module, *args, **kwargs)
  File "/home/mfb/.local/lib/python3.10/site-packages/anomalib/metrics/evaluator.py", line 142, in on_validation_batch_end
    metric.update(batch)
  File "/home/mfb/.local/lib/python3.10/site-packages/torchmetrics/metric.py", line 482, in wrapped_func
    update(*args, **kwargs)
  File "/home/mfb/.local/lib/python3.10/site-packages/anomalib/metrics/base.py", line 176, in update
    raise ValueError(msg)
ValueError: Cannot update metric of type <class 'anomalib.metrics.auroc.AUROC'>. Passed dataclass instance does not have a value for field with name gt_mask.

The same code works without this issue when I use ReverseDistillation as the model. So how can I train a FastFlow model on a custom folder dataset in anomalib 2.0.0?
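
One possible workaround for a classification-only setup (the Folder datamodule above defines no mask_dir, so validation batches never carry gt_mask, while FastFlow's default evaluator registers a pixel-level AUROC that requires it) is to hand the model an evaluator limited to image-level metrics. The sketch below is an illustration rather than a recipe from this thread; it assumes anomalib 2.0 exposes Evaluator, AUROC and F1Score under anomalib.metrics, that Evaluator accepts val_metrics and test_metrics, and that the Fastflow constructor accepts an evaluator argument.

# Hypothetical workaround: register only image-level metrics so that no
# metric ever asks the batch for gt_mask.
from anomalib.metrics import AUROC, Evaluator, F1Score
from anomalib.models import Fastflow

evaluator = Evaluator(
    val_metrics=[AUROC(fields=["pred_score", "gt_label"], prefix="image_")],
    test_metrics=[
        AUROC(fields=["pred_score", "gt_label"], prefix="image_"),
        F1Score(fields=["pred_label", "gt_label"], prefix="image_"),
    ],
)

# datamodule and engine as defined in the snippet above
model = Fastflow(evaluator=evaluator)
engine.fit(datamodule=datamodule, model=model)

With only image-level metrics registered, nothing in the evaluator requests gt_mask, so the metric update no longer fails.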

Dataset

Folder

Model

FastFlow

OS information

  • OS: Ubuntu 22.04
  • Python version: 3.10.12
  • Anomalib version: 2.0.0
  • PyTorch version: 2.5.1
  • CUDA/cuDNN version: 12.4
  • GPU models and configuration: Nvidia GeForce RTX 2070
  • Any other relevant information: lightning-2.5.2

Pip/GitHub

pip

What version/branch did you use?

2.0.0

Code of Conduct

  • [x] I agree to follow this project's Code of Conduct

haimat · Jul 23 '25 14:07

Thanks for submitting this issue! It has been added to our triage queue. A maintainer will review it shortly.

github-actions[bot] · Jul 23 '25 14:07

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] · Oct 22 '25 05:10

You need to modify the configure_evaluator function in fastflow/lightning_model.py (lines 205-214) and add the flag strict=False. That way, when validation data is missing gt_mask, the metric update is not strictly enforced and no error is raised. I verified on 2.3.0-dev that this works.

Original:

image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_")
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_")
val_metrics = [image_auroc, pixel_auroc]

# test_metrics
image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_")
image_f1score = F1Score(fields=["pred_label", "gt_label"], prefix="image_")
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_")
pixel_f1score = F1Score(fields=["pred_mask", "gt_mask"], prefix="pixel_")

Modified:

image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_", strict=False)
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False)
val_metrics = [image_auroc, pixel_auroc]

# test_metrics
image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_", strict=False)
image_f1score = F1Score(fields=["pred_label", "gt_label"], prefix="image_", strict=False)
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False)
pixel_f1score = F1Score(fields=["pred_mask", "gt_mask"], prefix="pixel_", strict=False)
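
The same non-strict metrics can also be supplied at model construction time instead of editing the installed package. The snippet below is only a sketch, under the assumption that the strict flag is available in the installed version (verified on 2.3.0-dev above), that Evaluator accepts val_metrics and test_metrics, and that the Fastflow constructor accepts an evaluator argument:

from anomalib.metrics import AUROC, Evaluator, F1Score
from anomalib.models import Fastflow

# strict=False: metrics whose fields (e.g. gt_mask) are absent from the batch
# are skipped instead of raising a ValueError on update.
val_metrics = [
    AUROC(fields=["pred_score", "gt_label"], prefix="image_"),
    AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False),
]
test_metrics = [
    AUROC(fields=["pred_score", "gt_label"], prefix="image_"),
    F1Score(fields=["pred_label", "gt_label"], prefix="image_"),
    AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False),
    F1Score(fields=["pred_mask", "gt_mask"], prefix="pixel_", strict=False),
]

model = Fastflow(evaluator=Evaluator(val_metrics=val_metrics, test_metrics=test_metrics))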

JockerLin · Oct 24 '25 02:10

Interesting! Good find, @JockerLin! @rajeshgangireddy, @ashwinvaidya17, would it be possible to validate this and see whether it is worth updating the code base?

samet-akcay · Oct 24 '25 06:10

I ran into the same problem, and @JockerLin's fix solved it! Do you plan to fix this in the next release, @samet-akcay?

tailyer · Nov 14 '25 09:11

Thanks for the follow-up, @tailyer.

Not sure if the team has validated this yet. Would one of you be interested in contributing a PR? @tailyer, @JockerLin?

samet-akcay · Nov 14 '25 09:11