anomalib

Is it possible to train with only normal_dir on a custom dataset?

Open nguyenanhtuan1008 opened this issue 2 years ago • 2 comments

https://github.com/openvinotoolkit/anomalib#custom-dataset

dataset:
 name: <name-of-the-dataset>
 format: folder
 path: <path/to/folder/dataset>
 normal_dir: normal # name of the folder containing normal images.
 abnormal_dir: null # name of the folder containing abnormal images.
 normal_test_dir: null # name of the folder containing normal test images.
 task: segmentation # classification or segmentation
 mask: <path/to/mask/annotations> #optional
 extensions: null
 split_ratio: 0.2  # ratio of the normal images that will be used to create a test split
 image_size: 256
 train_batch_size: 32
 test_batch_size: 32
 num_workers: 8
 transform_config:
   train: null
   val: null
 create_validation_set: true
 tiling:
   apply: false
   tile_size: null
   stride: null
   remove_border_count: 0
   use_random_tiling: False
   random_tile_count: 16
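
To illustrate what `split_ratio: 0.2` means in the config above, here is a minimal sketch of holding out a fraction of the normal images as a test split. It is only an illustration under stated assumptions: `split_normal_images`, the `seed` argument, and the example file names are hypothetical, and this is not anomalib's actual implementation.

```python
import random


def split_normal_images(paths, split_ratio=0.2, seed=0):
    """Hold out a fraction of the normal images as a test split.

    Hypothetical helper for illustration only; not anomalib's implementation.
    Returns (train_paths, test_paths).
    """
    if not 0.0 <= split_ratio < 1.0:
        raise ValueError("split_ratio must be in [0, 1)")
    shuffled = list(paths)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * split_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names standing in for the contents of normal_dir.
paths = [f"normal/{i:03d}.png" for i in range(10)]
train, test = split_normal_images(paths, split_ratio=0.2)
print(len(train), len(test))  # 8 2
```

Note that a test split built this way contains only normal images, so it cannot measure how well anomalies are detected.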

If someone gives me only normal images, I have only a normal_dir, and I would like to run train.py with just those images. After training, I will give the model to them so they can run predictions on their own dataset. Is this possible? When I try to set

normal_dir: normal
abnormal: null

it does not work.

In the MVTec notebook https://github.com/openvinotoolkit/anomalib/blob/development/notebooks/100_datamodules/102_mvtec.ipynb they train only on the train/ folder and test on a separate test/ folder. So I think we could do the same with custom data. Am I wrong?

nguyenanhtuan1008 avatar Jul 31 '22 06:07 nguyenanhtuan1008

Please read the notebook carefully.

BinglunWang avatar Aug 02 '22 23:08 BinglunWang

@CoolPandaWang I read it and I know we need normal data + abnormal data.

But my basic question is: do we need some abnormal data to train the model? Can we not use just normal images?

nguyenanhtuan1008 avatar Aug 05 '22 16:08 nguyenanhtuan1008

Hi, thanks for your question and apologies for the late response. Unfortunately we do not currently support training without abnormal examples. The reason for this is that Anomalib always performs a validation stage after each epoch, and a testing stage after training is complete. For this it requires at least a few anomalous images.

While we do see the added value of running Anomalib on datasets without anomalous images, we haven't had the chance to add this functionality to the library so far. One of the main problems is that we need the anomalous images to compute the adaptive value of the anomaly score threshold, which is computed by maximizing the F1 score over the validation set. Without anomalous images we wouldn't be able to use this adaptive threshold functionality, which would make it difficult for our models to generate accurate predictions. On the upside: We are actively working on an unsupervised thresholding mechanism that eliminates the need for anomalous images. Once we have added this to the library, we could also allow training on datasets without anomalous images. So please check again later!
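
The dependence on anomalous images can be seen in a small sketch of F1-based threshold selection. This is an illustrative brute-force version, not anomalib's code: the helper names `f1_score` and `best_f1_threshold`, the label convention (1 = anomalous, 0 = normal), and the example scores are all assumptions.

```python
def f1_score(labels, scores, threshold):
    """F1 for the rule `score >= threshold` means anomalous.

    labels: 1 = anomalous, 0 = normal (assumed convention).
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(labels, scores):
    """Try each observed score as a threshold and keep the one with the highest F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        f1 = f1_score(labels, scores, t)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Validation set with both normal and anomalous images (made-up scores).
labels = [0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.35, 0.70, 0.90]
print(best_f1_threshold(labels, scores))  # (0.7, 1.0)

# Validation set with normal images only: there are no true positives,
# so F1 is 0.0 at every threshold and the "best" threshold is arbitrary.
print(best_f1_threshold([0, 0, 0], [0.10, 0.20, 0.35]))  # (0.1, 0.0)
```

In the normal-only case the search simply returns the first candidate, which would flag every image as anomalous. The F1 criterion carries no information without at least a few anomalous validation images.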

djdameln avatar Sep 02 '22 15:09 djdameln