🐞 Cannot update metric with FastFlow on custom folder dataset
Describe the bug
I have a custom dataset with normal images (in the "good" folder) and abnormal images (in the "bad" folder), and I want to train a FastFlow model on it for a classification task. This is the code that defines the dataset and trains the model:
```python
from anomalib.data import Folder as AnomalibFolder
from anomalib.data.utils import TestSplitMode, ValSplitMode
from anomalib.engine import Engine
from torchvision.transforms import v2 as transforms

# `config`, `anomalib_dir`, `callbacks`, `train`, and `_get_model_class_by_name`
# are defined elsewhere in my script.

# Set the input size for the model
input_size = (config["img_width"], config["img_height"])
transform_pipeline = [transforms.Resize(input_size)]
transform = transforms.Compose(transform_pipeline)

# Create the datamodule from the folder structure
datamodule = AnomalibFolder(
    name="My Dataset",
    root=str(anomalib_dir),
    normal_dir="good",
    abnormal_dir="bad",
    train_batch_size=config["batch_size"],
    eval_batch_size=config["batch_size"],
    augmentations=transform,
    num_workers=config["workers"],
    test_split_mode=TestSplitMode.FROM_DIR,
    val_split_ratio=config["val_split_ratio"],
    val_split_mode=ValSplitMode.FROM_TRAIN,
    test_split_ratio=config["test_split_ratio"],
)

# Import the model class dynamically based on the user selection
model = _get_model_class_by_name(train.model_name)()

# Train the model using the Anomalib Engine
engine = Engine(
    max_epochs=config["epochs"],
    accelerator="gpu",
    devices=-1,
    callbacks=callbacks,
    strategy="ddp_find_unused_parameters_true",
)
engine.fit(datamodule=datamodule, model=model)
```
However, after the first epoch I get this error:
```
Traceback (most recent call last):
File "/home/mfb/.local/lib/python3.10/site-packages/anomalib/engine/engine.py", line 416, in fit
self.trainer.fit(model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 561, in fit
call._call_and_handle_interrupt(
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 47, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/strategies/launchers/subprocess_script.py", line 105, in launch
return function(*args, **kwargs)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 599, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1012, in _run
results = self._run_stage()
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1056, in _run_stage
self.fit_loop.run()
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py", line 216, in run
self.advance()
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py", line 455, in advance
self.epoch_loop.run(self._data_fetcher)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 153, in run
self.on_advance_end(data_fetcher)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 394, in on_advance_end
self.val_loop.run()
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/utilities.py", line 179, in _decorator
return loop_run(self, *args, **kwargs)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 145, in run
self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter)
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 451, in _evaluation_step
call._call_callback_hooks(trainer, hook_name, output, *hook_kwargs.values())
File "/home/mfb/.local/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 227, in _call_callback_hooks
fn(trainer, trainer.lightning_module, *args, **kwargs)
File "/home/mfb/.local/lib/python3.10/site-packages/anomalib/metrics/evaluator.py", line 142, in on_validation_batch_end
metric.update(batch)
File "/home/mfb/.local/lib/python3.10/site-packages/torchmetrics/metric.py", line 482, in wrapped_func
update(*args, **kwargs)
File "/home/mfb/.local/lib/python3.10/site-packages/anomalib/metrics/base.py", line 176, in update
raise ValueError(msg)
ValueError: Cannot update metric of type <class 'anomalib.metrics.auroc.AUROC'>. Passed dataclass instance does not have a value for field with name gt_mask.
```
However, the exact same code runs without this issue when I use ReverseDistillation as the model.
So how can I train a FastFlow model with a custom folder dataset in anomalib 2.0.0?
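My working theory is that the Folder datamodule never produces `gt_mask` because I pass no `mask_dir`, while FastFlow's default evaluator registers pixel-level metrics that require it. One workaround I am considering (an untested sketch; I am assuming here that the model constructor accepts an `evaluator` argument and that `Evaluator`, `AUROC`, and `F1Score` are exposed by `anomalib.metrics`) is to register image-level metrics only:

```python
from anomalib.metrics import AUROC, Evaluator, F1Score
from anomalib.models import Fastflow

# Image-level metrics only: a Folder dataset without mask_dir never yields
# gt_mask, so pixel-level metrics could never be updated anyway.
evaluator = Evaluator(
    val_metrics=[AUROC(fields=["pred_score", "gt_label"], prefix="image_")],
    test_metrics=[
        AUROC(fields=["pred_score", "gt_label"], prefix="image_"),
        F1Score(fields=["pred_label", "gt_label"], prefix="image_"),
    ],
)
model = Fastflow(evaluator=evaluator)
```

With no pixel-level metrics registered, nothing should ever ask the batch for `gt_mask`, but I have not verified this.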
Dataset
Folder
Model
FastFlow
OS information
- OS: Ubuntu 22.04
- Python version: 3.10.12
- Anomalib version: 2.0.0
- PyTorch version: 2.5.1
- CUDA/cuDNN version: 12.4
- GPU models and configuration: NVIDIA GeForce RTX 2070
- Any other relevant information: Lightning 2.5.2
Pip/GitHub
pip
What version/branch did you use?
2.0.0
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
You need to modify the `configure_evaluator` function in `fastflow/lightning_model.py` (lines 205-214) and add the flag `strict=False`. This way, when a validation batch is missing `gt_mask`, the metric update is not strictly enforced and no error is raised. I verified on 2.3.0-dev that this works.
Original:
```python
image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_")
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_")
val_metrics = [image_auroc, pixel_auroc]

# test_metrics
image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_")
image_f1score = F1Score(fields=["pred_label", "gt_label"], prefix="image_")
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_")
pixel_f1score = F1Score(fields=["pred_mask", "gt_mask"], prefix="pixel_")
```
Modified:
```python
image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_", strict=False)
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False)
val_metrics = [image_auroc, pixel_auroc]

# test_metrics
image_auroc = AUROC(fields=["pred_score", "gt_label"], prefix="image_", strict=False)
image_f1score = F1Score(fields=["pred_label", "gt_label"], prefix="image_", strict=False)
pixel_auroc = AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False)
pixel_f1score = F1Score(fields=["pred_mask", "gt_mask"], prefix="pixel_", strict=False)
```
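If you would rather not patch the installed package, I believe the same metrics can be passed at model construction time instead. This is only a sketch: it assumes the model's `evaluator` constructor argument and the `Evaluator` class from `anomalib.metrics` behave as in the 2.x API.

```python
from anomalib.metrics import AUROC, Evaluator, F1Score
from anomalib.models import Fastflow

# Same metrics as above, built with strict=False so a batch that is missing
# gt_mask is skipped instead of raising, then handed to the model directly.
val_metrics = [
    AUROC(fields=["pred_score", "gt_label"], prefix="image_", strict=False),
    AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False),
]
test_metrics = [
    AUROC(fields=["pred_score", "gt_label"], prefix="image_", strict=False),
    F1Score(fields=["pred_label", "gt_label"], prefix="image_", strict=False),
    AUROC(fields=["anomaly_map", "gt_mask"], prefix="pixel_", strict=False),
    F1Score(fields=["pred_mask", "gt_mask"], prefix="pixel_", strict=False),
]
model = Fastflow(
    evaluator=Evaluator(val_metrics=val_metrics, test_metrics=test_metrics),
)
```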
Interesting! Good find, @JockerLin! @rajeshgangireddy, @ashwinvaidya17, would it be possible to validate this and see whether it is worth updating the code base?
I ran into the same problem, and @JockerLin's suggestion solved it! Do you plan to fix this in the next release, @samet-akcay?
Thanks for the follow-up, @tailyer.
I'm not sure whether the team has validated this yet. Would one of you be interested in creating a PR to contribute the fix, @tailyer, @JockerLin?