
🚀 feat(model): add GLASS model into Anomalib

Open code-dev05 opened this issue 9 months ago • 18 comments

📝 Description

  • This PR will add the GLASS model for synthesis of anomalies on a global and local level.
  • 🛠️ Fixes #2619

✨ Changes

Select what type of change your PR is:

  • [x] 🚀 New feature (non-breaking change which adds functionality)
  • [x] 📚 Documentation update

TODO

  • [x] Implement the model from this research paper.
  • [x] Write a trainer for the above model.
  • [x] Add comments to the code for easier understanding.
  • [ ] Write test cases for verification.
  • [ ] Update documentation for the above model.
  • [ ] Any optimizations if needed?

I will keep updating the PR as I implement the rest of the list.

code-dev05 avatar Mar 26 '25 10:03 code-dev05

@samet-akcay I am facing a problem. I need to add Perlin noise to the image for training, as is done here in the getitem method. Should I add the noise to the image during the training step, or should I build a dict before the training step as in the above link? For the second approach, I think I would need to write a new dataset loader.

code-dev05 avatar Apr 11 '25 09:04 code-dev05

@code-dev05, you could use Anomalib's PreProcessor class. The advantage of the pre-processor is that you can apply any transform to the model input, including Perlin noise. You can set the train, val, and test transforms separately, or all at once.

Anomalib also provides a PerlinAnomalyGenerator class that you could instantiate as a transform and pass into the PreProcessor.
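
Roughly, the wiring could look like the sketch below. The import paths, constructor arguments, and return values here are assumptions and may differ between Anomalib versions, so please double-check them against the docs.

```python
# Hedged sketch: import paths and argument names are assumptions, not the
# exact API; check the PreProcessor / PerlinAnomalyGenerator documentation.
from anomalib.data.utils.generators import PerlinAnomalyGenerator  # assumed path
from anomalib.pre_processing import PreProcessor

# Apply Perlin-based anomaly synthesis to the training images only.
pre_processor = PreProcessor(train_transform=PerlinAnomalyGenerator())

# The pre-processor would then be handed to the model, e.g.
# model = Glass(pre_processor=pre_processor)
```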

Let me know if the documentation is not clear or if you need more clarification.

samet-akcay avatar Apr 12 '25 05:04 samet-akcay

Thank you for the advice. I will look into it and get back to you if I do not understand something.

code-dev05 avatar Apr 12 '25 14:04 code-dev05

Alternatively, you could use the perlin_noise_generator function in the on_train_batch_start method of your Glass implementation that inherits from AnomalibModule.

This would apply the Perlin noise as soon as the model receives the input at the start of each training batch.

You could prototype either one. We can always polish it later once your PR is ready for review.
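
A rough sketch of the hook-based option is below; the import path, the Batch field names, and the perlin_generator attribute are placeholders, not the final GLASS implementation.

```python
# Hedged sketch: AnomalibModule import path, Batch fields, and the
# perlin_generator attribute are assumptions made for illustration.
from anomalib.models.components import AnomalibModule  # assumed path


class Glass(AnomalibModule):
    def on_train_batch_start(self, batch, batch_idx: int) -> None:
        # Replace the clean training images with Perlin-augmented ones just
        # before the training step receives the batch. The generator is
        # assumed to return (augmented_image, anomaly_mask).
        augmented, anomaly_mask = self.perlin_generator(batch.image)  # hypothetical attribute
        batch.image = augmented
        batch.gt_mask = anomaly_mask
```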

samet-akcay avatar Apr 13 '25 05:04 samet-akcay

@samet-akcay I have pushed a new commit. Please check it. I have noticed a few issues:

  • The first is computing the center of the dataset. Since we are using the Batch dataclass, in one training step we only compute the outputs for one batch, but the center requires the whole dataset. One option would be to compute the center over the whole dataset during initialisation; the other would be to compute the center of only the current batch at each training step, which would be less accurate.
  • The second is the use of two different datasets. In the original code, they use the DTD dataset alongside MVTec for different textures. How should that be implemented? In my code I have used batch.image and batch.aug, so I am assuming we store both the image and the texture in the same object. I think we would have to create a new dataclass for this? If so, we could integrate the Perlin noise directly into the image instead of applying it during the training step. Thank you for answering my queries.

code-dev05 avatar Apr 14 '25 15:04 code-dev05

@samet-akcay I will also need to create a datamodule for the DTD dataset. Should I add that in this PR or create a new one?

code-dev05 avatar Apr 19 '25 06:04 code-dev05

Hi @code-dev05

Thanks a lot for your effort! Here are some comments that may help you address your issues related to center computation and noise generation:

  • The first is computing the center of the dataset. Since we are using the Batch dataclass, in one training step we only compute the outputs for one batch, but the center requires the whole dataset. One option would be to compute the center over the whole dataset during initialisation; the other would be to compute the center of only the current batch at each training step, which would be less accurate.

We want to reproduce the algorithm from the original paper as closely as possible, so I would suggest computing the center of the dataset in an initialization step. You could try using the setup or on_fit_start/on_train_start Lightning hooks for this (see the Lightning docs for further reading on model hooks).
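
A rough sketch of what this could look like with on_fit_start; the feature extractor attribute, the Batch fields, and the dataloader access are placeholders rather than the actual GLASS code.

```python
# Hedged sketch: names below are assumptions made for illustration.
import torch

from anomalib.models.components import AnomalibModule  # assumed path


class Glass(AnomalibModule):
    def on_fit_start(self) -> None:
        # Compute the feature-space center over the full training set once,
        # before any training step runs, as in the original paper.
        features = []
        dataloader = self.trainer.datamodule.train_dataloader()
        with torch.no_grad():
            for batch in dataloader:
                images = batch.image.to(self.device)
                features.append(self.feature_extractor(images))  # hypothetical extractor
        self.center = torch.cat(features).mean(dim=0)
```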

  • The second is the use of two different datasets. In the original code, they use the DTD dataset alongside MVTec for different textures. How should that be implemented? In my code I have used batch.image and batch.aug, so I am assuming we store both the image and the texture in the same object. I think we would have to create a new dataclass for this? If so, we could integrate the Perlin noise directly into the image instead of applying it during the training step.

Looking at your code, it is not entirely clear to me how you are intending to use batch.aug. The current (Image)-Batch class does not contain the aug key. Were you planning on adding it? This would involve fundamental changes to Anomalib's dataset and dataclasses interface, which I'm not sure is the way to go here.

My personal preference would be to apply the augmentation within the training_step. This is similar to how Perlin noise augmentations are used in the DRAEM model (see here). In my view, this is the most flexible approach, as the training step retains access to both the original and the augmented image. At the same time, we ensure that the augmentation is always applied before the images are passed to the model during training (contrary to adding the noise as an augmentation step on the dataset side, which could lead to problems if the dataset is not configured correctly by the user).
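
Roughly, the training step could then look like the sketch below; the augmenter attribute, the loss signature, and the model calls are placeholders, not a definitive design.

```python
# Hedged sketch: attribute, loss, and model names are assumptions.
from anomalib.models.components import AnomalibModule  # assumed path


class Glass(AnomalibModule):
    def training_step(self, batch, batch_idx: int):
        # Generate the synthetic anomalies on the fly, so the step still has
        # access to both the clean and the augmented images.
        augmented, anomaly_mask = self.augmenter(batch.image)  # hypothetical attribute
        loss = self.loss(
            self.model(batch.image),  # normal branch
            self.model(augmented),    # synthetic-anomaly branch
            anomaly_mask,
        )
        self.log("train_loss", loss)
        return loss
```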

I will also need to create a datamodule for the DTD dataset. Should I add that in this PR or create a new one?

This might not be necessary. As far as I understand (but please correct me if I'm wrong), we only use the DTD dataset as a source of textures for the synthetic anomaly generation step. This means that we don't need any ground truth information for this dataset, just the raw images. So instead of implementing a dedicated dataset/datamodule, we can just randomly sample some images that we read from the file system in every step. In fact, this mechanism has already been implemented for DRAEM, which uses a similar approach. Please have a look at this class to see if you can use this for the augmentation step of your GLASS model (note that you can just pass the root of the DTD dataset on your file system, and the augmenter will automatically sample and read the images).
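
For illustration, the usage could look roughly like this; the anomaly_source_path argument name is based on the DRAEM-style augmenter and may differ, so please verify it against the class you end up reusing.

```python
# Hedged sketch: point the Perlin-based augmenter at the DTD images so the
# synthetic anomalies use DTD textures. The import path and argument name are
# assumptions based on the DRAEM implementation.
from anomalib.data.utils.generators import PerlinAnomalyGenerator  # assumed path

augmenter = PerlinAnomalyGenerator(anomaly_source_path="/path/to/dtd/images")

# Each call is assumed to sample a random texture image, blend it into the
# input using a Perlin-noise mask, and return the augmented image plus mask.
augmented, anomaly_mask = augmenter(batch.image)
```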

djdameln avatar Apr 23 '25 17:04 djdameln

Looking at your code, it is not entirely clear to me how you are intending to use batch.aug. The current (Image)-Batch class does not contain the aug key. Were you planning on adding it? This would involve fundamental changes to Anomalib's dataset and dataclasses interface, which I'm not sure is the way to go here.

Yes, I was planning on adding a new dataclass interface, but I will look at how it is done in the DRAEM model and try to do the same here. I will also add the hooks for computing the center as soon as possible.

I will also make the suggested changes as soon as possible.

Thanks for the help.

code-dev05 avatar Apr 24 '25 09:04 code-dev05

@samet-akcay @ashwinvaidya17 @djdameln The PR is ready for review. Can you please review it?

code-dev05 avatar May 07 '25 05:05 code-dev05

@code-dev05 Also, the tests are failing. Do you have access to the test results?

samet-akcay avatar May 07 '25 13:05 samet-akcay

Yes, I can see them. I will address those also and see what I can do to solve them.

code-dev05 avatar May 07 '25 13:05 code-dev05

Yes, I can see them. I will address those also and see what I can do to solve them.

Great, thanks

samet-akcay avatar May 07 '25 13:05 samet-akcay

@samet-akcay The pre-commit hook is giving an error on line 202 of the lightning_model.py file. It says that .shape is not an attribute of item type Any | None. But I am sure the variable img will never be None and will always have the shape attribute. How do I tackle this?

code-dev05 avatar May 13 '25 15:05 code-dev05

Hi @code-dev05. I'm interested in trying out this implementation and possibly providing some feedback or fixes. Does it make sense to give this a go already, or is it better to wait?

hgaiser avatar Jul 01 '25 12:07 hgaiser

Hi @hgaiser, thank you for your interest in this implementation. I would suggest waiting for some time until all the issues are fixed and the PR is merged.

code-dev05 avatar Jul 01 '25 13:07 code-dev05

Hi @hgaiser, thank you for your interest in this implementation. I would suggest waiting for some time until all the issues are fixed and the PR is merged.

Alright, thanks for replying. Let me know if you need an extra pair of hands on this.

hgaiser avatar Jul 01 '25 18:07 hgaiser

Can you also test if the model runs via the CLI?

ashwinvaidya17 avatar Jul 15 '25 09:07 ashwinvaidya17

Oh and also include a config file with default parameters. https://github.com/open-edge-platform/anomalib/tree/main/examples/configs/model

ashwinvaidya17 avatar Aug 01 '25 11:08 ashwinvaidya17