
MMDDriftOnline keeps detecting drift every 5 frames or so?

Open nghible opened this issue 2 years ago • 7 comments

Hi,

I am having trouble getting sensible results from the MMDDriftOnline class. This could be because of the hyperparameter configuration of the ERT or window size, but I am not completely sure, as I am not too familiar with some of the methods used in this algorithm. I would really appreciate it if someone could point me in the right direction.

So I have a reference data set of 1800 images, taken sequentially from a video clip. The preprocess function I used is an untrained encoder, basically just an untrained CNN. I then initialized the MMDDriftOnline detector with said preprocessing function, ERT=150 and window size=24. To test the algorithm, I re-fed the same clip frame-by-frame into the detector. In principle, drift should not have been detected until after about 150 frames, right? But the algorithm detected drift after about 5 frames. I calculated the average number of frames before drift is detected and it averaged about 4.5.
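
For reference, a minimal sketch of the setup described above (the frame shape, encoder architecture and batch size are placeholders, and `x_ref` is assumed to be the array of 1800 reference frames):

```python
import numpy as np
import tensorflow as tf
from functools import partial
from alibi_detect.cd import MMDDriftOnline
from alibi_detect.cd.tensorflow import preprocess_drift

# Untrained CNN encoder used as the preprocessing step (architecture is a placeholder).
encoder = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(64, 64, 3)),  # assumed frame size
    tf.keras.layers.Conv2D(8, 4, strides=2, padding='same', activation='relu'),
    tf.keras.layers.Conv2D(16, 4, strides=2, padding='same', activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32)  # project each frame onto a 32-d feature space
])
preprocess_fn = partial(preprocess_drift, model=encoder, batch_size=64)

# x_ref: np.ndarray of shape (1800, 64, 64, 3) holding the reference frames (assumed).
cd = MMDDriftOnline(x_ref, ert=150, window_size=24,
                    backend='tensorflow', preprocess_fn=preprocess_fn)

# Re-feed the same clip frame by frame and record when drift is first flagged.
for t, x_t in enumerate(x_ref):
    if cd.predict(x_t)['data']['is_drift']:
        print(f'Drift flagged at frame {t}')
        break
```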

I have tried various combinations of the window size and ERT. A window size of 150 pushed the average number of frames before drift is detected up to about 10, which is still far from 150.

Can someone help me understand why this is the case? Thank you!

nghible avatar Jul 06 '22 18:07 nghible

Hi @nghible. Your reference data contains 1800 images. In order for the drift detector to not flag drift, the test window is expected to form an i.i.d. sample from the same underlying distribution as the reference data. But since the test window size is only 24, it is very unlikely that it covers the same distribution as the much richer (1800 vs. 24 instances) reference data. Importantly, since the data are taken sequentially from a video clip, there will be high autocorrelation and similarity between consecutive instances. If the test data is sampled in a similar way, it is very unlikely to constitute an i.i.d. sample from the same underlying distribution as the full reference data, and therefore drift would (correctly) be flagged. Of course I do not know your data, but a sensible thing to do is to ask yourself at which point you would intuitively expect the test window to cover roughly the full reference data distribution, taking the autocorrelation issues into account.

A second potential issue could be the preprocessing step (although I believe this is not the primary issue here). While an untrained CNN might work, ideally you would have a trained (auto)encoder, which would map the input onto the lower dimensional latent space.

Note that the context-aware detector could be helpful in your case, since it can condition the detector on relevant context and thereby get around the non-i.i.d.-ness of your data (due to the sequential instances). Hope this was helpful!
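
To make this concrete, a quick hypothetical experiment along these lines would compare how long the detector runs before flagging drift on the sequential clip versus on an i.i.d. resample of the reference data (reusing `x_ref` and `preprocess_fn` from the sketch above). Under no drift and a genuinely i.i.d. stream, the average run length before a (false) detection should be close to the configured ERT of 150:

```python
import numpy as np
from alibi_detect.cd import MMDDriftOnline

def frames_until_drift(stream):
    # Fresh detector per run, since the online detector keeps internal state.
    cd = MMDDriftOnline(x_ref, ert=150, window_size=24,
                        backend='tensorflow', preprocess_fn=preprocess_fn)
    for t, x_t in enumerate(stream):
        if cd.predict(x_t)['data']['is_drift']:
            return t
    return len(stream)

t_seq = frames_until_drift(x_ref)  # sequential, autocorrelated stream
idx = np.random.choice(len(x_ref), size=len(x_ref), replace=True)
t_iid = frames_until_drift(x_ref[idx])  # i.i.d. resample of the reference data
print(f'sequential: drift after {t_seq} frames; i.i.d.: drift after {t_iid} frames')
```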

arnaudvl avatar Jul 07 '22 09:07 arnaudvl

Hi @arnaudvl, Thank you for responding! I have a few follow-up questions:

  1. The 1800 images I used for the reference data are the entire video clip itself. So when I re-fed the same video clip frame-by-frame (sequentially from frame 1 to frame 1800, so yes, there is autocorrelation) into the algorithm, the frames being fed in are also the reference data. If that is the case, does it make sense that the algorithm still detects drift when the frames fed in are the same as the frames in the reference data?
  2. My application does have a lot of autocorrelated frames, since I work with very long videos taken from stationary cameras. According to what you said, MMDDriftOnline will not work well for me and I should look at the context-aware detector? If the system is deployed in a real application, the new test frames that come in will certainly be autocorrelated. But I guess I can sample the reference data in a way that lowers its autocorrelation? Will it help if I create a diverse reference set, say from various different timepoints along the video (see the sketch after this list)?
  3. Regarding pre-processing, I used the untrained encoder because an Alibi-detect tutorial used it as well and cited a paper called "Failing Loudly" saying this method gives results equivalent to a trained autoencoder? Do you guys think that a trained autoencoder would be better for my case?
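
For point 2, a minimal sketch of what such a subsampled reference set could look like (`all_frames` and the sizes are hypothetical placeholders):

```python
import numpy as np

# all_frames: hypothetical array of frames pooled from many timepoints
# (different hours, days, lighting conditions) of the stationary camera.
rng = np.random.default_rng(0)
idx = rng.choice(len(all_frames), size=1800, replace=False)
x_ref = all_frames[np.sort(idx)]  # spread out in time, hence less autocorrelated
```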

Thank you so much!

nghible avatar Jul 07 '22 18:07 nghible

  1. Yes, if you feed in the same frames but only a small subset of those in the reference data, drift is very likely to be flagged. This is because the detector compares the small test window (e.g. 24 instances) with all 1800 instances in the reference set. Let's say instances 1 -> 1800 are in the reference set and 1 -> 24 in the test set. In this case the similarity will be very high between instances 1 -> 24 in the reference set and the full test set. Given the autocorrelation, the similarity might still be high between, say, instances 25 -> 50 (just making up some numbers here) and the test set. But as the video continues and changes more and more, the underlying distribution might become very different for, say, instances 50 -> 1800. As a result, drift is flagged when comparing instances 1 -> 1800 in the reference set with a test set that might equal instances 1 -> 24 of the reference set.
  2. The key is that under the "no drift" scenario, you would expect the test data to form an i.i.d. sample from the same underlying distribution as the reference data. But since you want to do this in real time, you have to take in sequential instances in the test window and would not expect this i.i.d. assumption to hold. For instance, your reference data might contain images taken under various lighting conditions (day, night, cloudy, etc.) while your test window will come from a specific time of day, covering only one lighting condition. As a result, you would always expect your detector to flag drift. The context-aware detector, however, can take this into account by conditioning on, for instance, the time of day. This allows you to keep your diverse, sequential reference set as well as test on consecutive instances, so the autocorrelation is taken into account (see the sketch after this list). Check this example for more info or reach out for more specific help with your use case.
  3. If you can use a trained autoencoder, I would expect it to perform better.
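
A minimal sketch of the context-aware detector with a time-of-day context (the cyclical hour encoding and names like `ref_hours`, `x_window` are illustrative choices, and `x_ref`/`preprocess_fn` are reused from earlier; note that `ContextMMDDrift` is an offline detector, so you would test batches of consecutive frames together with their context):

```python
import numpy as np
from alibi_detect.cd import ContextMMDDrift

def hour_to_context(hours):
    # Encode the hour cyclically so that 23:00 and 00:00 end up close together.
    theta = 2 * np.pi * np.asarray(hours) / 24.
    return np.stack([np.sin(theta), np.cos(theta)], axis=1).astype(np.float32)

c_ref = hour_to_context(ref_hours)  # ref_hours: capture hour of each reference frame
cd = ContextMMDDrift(x_ref, c_ref, p_val=.05,
                     backend='tensorflow', preprocess_fn=preprocess_fn)

# Test a window of consecutive frames together with the context they were taken in.
preds = cd.predict(x_window, hour_to_context(window_hours))
print(preds['data']['is_drift'])
```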

arnaudvl avatar Jul 08 '22 08:07 arnaudvl

Ah, now I see how the algorithm works. Thank you for the answer. However, I am still a bit confused at the moment regarding how the context-aware detector can help with taking autocorrelation into account.

> The context-aware detector, however, can take this into account by conditioning on, for instance, the time of day. This allows you to keep your diverse, sequential reference set as well as test on consecutive instances, so the autocorrelation is taken into account.

I understand conditioning on the time of day, such as having a context variable that denotes day or night for the new data. But doesn't this just mean that the algorithm will take into account the natural shift in distribution between daytime and nighttime? I still can't quite grasp how this is related to the autocorrelation between consecutive frames.

For example, say I have two reference video clips. The first reference clip has a variable denoting that it is a nighttime dataset and the second has a variable denoting that it is a daytime dataset. Because they are video clips, the frames are sequential and autocorrelated, as you said in the quote above. If I use these reference clips to initialize the context-aware detector, and the new incoming clip is a daytime clip with its context variable set to daytime, wouldn't the new autocorrelated frames be compared to the entire daytime reference clip, which also has autocorrelated frames, putting us back at the same problem? Or am I misunderstanding what you meant?

Another example: say I sample the reference frames in the same time period (between 9:00 AM and 11:00 AM, randomly choosing 10 frames every minute) but over the course of multiple weeks, then mix them all together into one reference dataset so as to minimize autocorrelation, and denote the context as daytime. Won't the new autocorrelated frames that come in be flagged as drift because the reference dataset has much less autocorrelation?

Thank you!

nghible avatar Jul 11 '22 20:07 nghible

Just a follow-up: I now realize that my current application's ML models do not actually operate on the temporal dimension; they treat every frame as an independent data point. So for my case, do you think I can use another drift detection algorithm that does not treat the data points in temporal order, so as to sidestep this autocorrelation problem? Thank you @arnaudvl !

nghible avatar Jul 12 '22 17:07 nghible

> Ah, now I see how the algorithm works. [...] Won't the new autocorrelated frames that come in be flagged as drift because the reference dataset has much less autocorrelation?

The time-of-day context variable was just an example to illustrate the underlying method, not necessarily the right one for your application. Let me have a think about what more relevant context variables could look like for your application.

arnaudvl avatar Aug 03 '22 10:08 arnaudvl

> Just a follow-up: I now realize that my current application's ML models do not actually operate on the temporal dimension. [...]

I might be wrong, but I don't think so. The model will treat each image individually, but still sees the instances one after the other, right? So there is still autocorrelation between the individual frames that are fed to the model. Let me know if I misunderstood something, and apologies for the late reply!

arnaudvl avatar Aug 03 '22 10:08 arnaudvl

Closing for now. Please feel free to reopen if you have any follow-up questions!

ascillitoe avatar Sep 14 '22 08:09 ascillitoe