mtad-gat-pytorch
about adjust_predicts(), please!!!
First, thanks for making this repo public; I have learned a lot from the issues and from your replies. I have seen the following snippet many times:
for i in range(len(predict)):
    # Enter the "anomaly state" the first time the model flags a point
    # inside a ground-truth anomalous segment.
    if any(actual[max(i, 0) : i + 1]) and predict[i] and not anomaly_state:
        anomaly_state = True
        anomaly_count += 1
        # Walk backwards over the same segment and mark every earlier point
        # of the event as detected; each back-filled point adds 1 to latency.
        for j in range(i, 0, -1):
            if not actual[j]:
                break  # left the anomalous segment
            else:
                if not predict[j]:
                    predict[j] = True
                    latency += 1
    elif not actual[i]:
        # Outside an anomalous segment: reset the anomaly state.
        anomaly_state = False
    if anomaly_state:
        # While inside a detected segment, mark the current point as well.
        predict[i] = True
It's part of adjust_predicts(). I am very curious: what is its purpose, and how is the "latency" worked out?
Dear @fffii ,
The adjust_predicts() function essentially performs what is known in the Anomaly Detection literature as the "Point Adjustment" strategy. You can read more about it, for example, here, but, in a nutshell, point adjustment is based on the following assumption:
In a real-world scenario, we are mainly interested in continuous anomalous segments rather than point anomalies. A good anomaly detector should be able to identify such events, without necessarily identifying all individual anomalous instances that they contain. Therefore, as long as at least 1 anomaly has been identified within an event, we can consider the entire event as "identified".
This implies that, during model evaluation, all instances of an event are regarded as "correctly identified anomalies", as long as at least one of them was correctly identified by the model. To show an example of this, suppose that we have the following labels for a time-series' data points:
Ground Truth = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1]
This corresponds to a time-series with 4 events: 2 point anomalies (near the start and at the end), as well as 2 continuous segments (one with 10 points and one with 3 points). Now let's suppose that our model identified the following labels:
Model Results = [0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
Our model correctly identifies the first and last event, i.e. the point anomalies. However, it wrongly identifies one point-anomaly at index No. 4 and completely misses the continuous segment that consists of 3 points. Nonetheless, it identifies some of the anomalies in the continuous segment that consists of 10 points. The corresponding point-adjusted labels are:
PA Results = [0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
What happened here? The wrongly identified point anomaly at index No. 4 remains there. Additionally, the unidentified continuous segment that consists of 3 points remains unidentified. However, the continuous segment that consists of 10 points is considered to be fully identified, simply because 3 points of it were successfully identified by the model.
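To make this concrete, here is a minimal, segment-wise sketch of the idea (an illustration, not the exact point-by-point loop from adjust_predicts()) that reproduces the adjustment on the arrays above:

# Segment-wise sketch of point adjustment (illustration only, not the repo's
# exact implementation): if the model flagged at least one point inside a
# ground-truth anomalous segment, the whole segment is marked as detected.
ground_truth  = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1]
model_results = [0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]

adjusted = list(model_results)
i, n = 0, len(ground_truth)
while i < n:
    if ground_truth[i]:
        # Find the end (exclusive) of this continuous anomalous segment.
        j = i
        while j < n and ground_truth[j]:
            j += 1
        # If the model flagged any point inside [i, j), flag the whole segment.
        if any(model_results[i:j]):
            adjusted[i:j] = [1] * (j - i)
        i = j
    else:
        i += 1

print(adjusted)
# [0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

Note that the false positive at the start is left untouched, because the adjustment only expands detections inside ground-truth segments; it never removes them.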
My personal view on the matter is that Point Adjustment is a way of presenting very high Accuracies and F1-Scores by sweeping the truth about a model's actual capabilities under the rug. This is why it has been heavily criticized over the past few years, and alternatives have been suggested in the literature.
Finally, as far as latency is concerned, this is simply "the time it took the model to identify an event". In the previous example, the model's latency in identifying the point anomalies is zero, because it correctly identified both of them. For the continuous segment of 3 anomalies, the latency is infinite, because the event was never identified. Finally, for the continuous segment of 10 anomalies, the latency is 4 timestamps, because the first 4 anomalies of the event were missed by the model.
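If it helps, the same idea can be expressed as a small hypothetical helper (event_latencies is my own name, not part of the repo) that reports, for each ground-truth event, how many anomalous points elapse before the model's first alarm. The loop in your snippet instead accumulates latency as a running total of back-filled points over all events, but the per-event view below matches the numbers in the example:

# Hypothetical helper (not from the repo) illustrating latency per event:
# the number of anomalous points that pass before the model's first alarm
# within a ground-truth segment; None means the event was never identified.
def event_latencies(ground_truth, model_results):
    latencies = []
    i, n = 0, len(ground_truth)
    while i < n:
        if ground_truth[i]:
            # Find the end (exclusive) of this continuous anomalous segment.
            j = i
            while j < n and ground_truth[j]:
                j += 1
            hits = [k for k in range(i, j) if model_results[k]]
            latencies.append(hits[0] - i if hits else None)
            i = j
        else:
            i += 1
    return latencies

ground_truth  = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1]
model_results = [0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(event_latencies(ground_truth, model_results))
# [0, 4, None, 0]  -> point anomaly, 10-point segment, 3-point segment, point anomaly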
Thanks! It's very helpful to me, have a good day!