anomalib
Allow custom Data Normalization and consider it in Post-processing
Is your feature request related to a problem? Please describe.
By default, and without warning the user, if no transform_config_{train,val} is supplied to the datamodule, the code normalizes the images with ImageNet statistics. That may be a good default for the included datasets, but a warning should be printed when the developer uses custom data.
At the post-processing step, e.g. when visualizing the images using the ImageVisualizerCallback, Denormalize()(batch["image"][i].cpu()) is applied in Visualizer.visualize_batch (anomalib/anomalib/post_processing/visualizer.py:89), which again defaults to the ImageNet statistics. But unlike the input datamodule, the visualizer cannot be configured to use the normalization configured for the input dataset. With custom statistics, the images are therefore "denormalized" to a different range than the original, so that $\mathrm{denorm}(\mathrm{norm}(\mathrm{image})) \ne \mathrm{image}$.
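To illustrate the mismatch, here is a minimal, framework-free sketch on a single RGB pixel. The `normalize` and `denormalize` helpers and the `CUSTOM_*` statistics are made up for this example and are not anomalib code; only the ImageNet mean and standard deviation are the commonly used values.

```python
# ImageNet per-channel statistics (RGB), as commonly used for pretrained models.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

# Hypothetical statistics of a custom dataset (assumption for illustration).
CUSTOM_MEAN = (0.5, 0.5, 0.5)
CUSTOM_STD = (0.25, 0.25, 0.25)


def normalize(pixel, mean, std):
    """Apply (x - mean) / std per channel to one RGB pixel in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(pixel, mean, std))


def denormalize(pixel, mean, std):
    """Invert normalize(): x * std + mean per channel."""
    return tuple(x * s + m for x, m, s in zip(pixel, mean, std))


pixel = (0.2, 0.6, 0.9)

# The datamodule normalizes with the custom statistics ...
normed = normalize(pixel, CUSTOM_MEAN, CUSTOM_STD)

# ... and only the matching statistics recover the original pixel.
matched = denormalize(normed, CUSTOM_MEAN, CUSTOM_STD)

# Denormalizing with the ImageNet defaults yields different values.
mismatched = denormalize(normed, IMAGENET_MEAN, IMAGENET_STD)

print("original:  ", pixel)
print("matched:   ", matched)
print("mismatched:", mismatched)
```

The round trip is only the identity when the same mean and standard deviation are used on both sides, which is what the visualizer currently cannot guarantee.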
Describe the solution you'd like
- Add a parameter to Visualizer to configure the normalization
- Or: automatically set the parameters on the callback when transform_config_{train,test} != None is discovered in the datamodule
- Warn the user (at least for custom datasets) that ImageNet normalization is applied by default when transform_config_{train,val} is None
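The points above could be combined roughly as follows. This is only a sketch under assumptions: `resolve_normalization`, `SketchVisualizer`, and the `mean`/`std` config keys are hypothetical names for illustration and do not reflect anomalib's actual API.

```python
import warnings

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def resolve_normalization(transform_config=None):
    """Return (mean, std), warning when falling back to ImageNet defaults.

    Hypothetical helper; `transform_config` is assumed to be a dict
    with "mean" and "std" entries.
    """
    if transform_config is None:
        warnings.warn(
            "No transform config supplied: falling back to ImageNet "
            "normalization. Supply custom statistics if your data differs.",
            UserWarning,
            stacklevel=2,
        )
        return IMAGENET_MEAN, IMAGENET_STD
    return tuple(transform_config["mean"]), tuple(transform_config["std"])


class SketchVisualizer:
    """Hypothetical stand-in for a visualizer with configurable statistics."""

    def __init__(self, mean=IMAGENET_MEAN, std=IMAGENET_STD):
        self.mean = mean
        self.std = std

    def denormalize(self, pixel):
        """Undo (x - mean) / std for one RGB pixel."""
        return tuple(x * s + m for x, m, s in zip(pixel, self.mean, self.std))


# The datamodule and the visualizer share one source of truth.
custom_config = {"mean": [0.5, 0.5, 0.5], "std": [0.25, 0.25, 0.25]}
mean, std = resolve_normalization(custom_config)
visualizer = SketchVisualizer(mean=mean, std=std)
restored = visualizer.denormalize((-1.2, 0.4, 1.6))
```

Calling `resolve_normalization(None)` emits a `UserWarning` and returns the ImageNet statistics, so the default behaviour stays the same but is no longer silent.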
Describe alternatives you've considered
- ImageNet normalization may be great to reproduce paper results, but when used in other projects, custom input data normalization is common.
Additional context
My suggestion for solving this issue is to save the (validation) preprocessing pipeline with the model so it can be loaded from the model file for inference. This is possible as long as the config always contains at least the default transformations, which would then be saved as hyperparameters in the model file.
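A rough sketch of that idea, assuming the preprocessing settings can be expressed as plain data: JSON stands in for the real model file format here, and `save_model`, `load_val_transform`, and the `hyper_parameters`/`val_transform` keys are illustrative names, not anomalib's actual checkpoint layout.

```python
import json
import os
import tempfile


def save_model(path, weights, val_transform):
    """Write weights together with the validation preprocessing settings."""
    checkpoint = {
        "weights": weights,
        "hyper_parameters": {"val_transform": val_transform},
    }
    with open(path, "w") as f:
        json.dump(checkpoint, f)


def load_val_transform(path):
    """Read the preprocessing settings back from the model file."""
    with open(path) as f:
        return json.load(f)["hyper_parameters"]["val_transform"]


# Hypothetical custom normalization used during training and validation.
val_transform = {"normalize": {"mean": [0.5, 0.5, 0.5], "std": [0.25, 0.25, 0.25]}}

path = os.path.join(tempfile.mkdtemp(), "model.json")
save_model(path, weights=[0.1, 0.2], val_transform=val_transform)

# At inference or visualization time, the same statistics are available again.
restored_transform = load_val_transform(path)
print(restored_transform["normalize"])
```

With the settings stored next to the weights, both inference and the visualizer could read the statistics from the model file instead of assuming ImageNet defaults.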
Thanks for pointing this out, I do agree with your observations. We will discuss internally what would be the best way to address these points and then assign someone from our team to work on it.