DeepLog Question regarding the predicted variable

Yifan,

Source: LogKeyModel_predict.py

In the code below, can you please explain the difference between the output and predicted variables? Is output the same as predicted except it being sorted in tensors? Also, shouldn't the value of the predicted variable be something binary so that we can determine whether the predicted outcome is anomalous or not?

output = model(seq)
predicted = torch.argsort(output, 1)[0][-num_candidates:]

Thanks, Deep

Sep 03 '20 13:09 nagsubhadeep

Deep, the output is a probability distribution over the log keys: for each key, the probability that it appears as the next log key given the history.

Sep 03 '20 13:09 wuyifan18
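To make the difference between the two variables concrete, here is a minimal sketch in plain Python that mimics what `torch.argsort(output, 1)[0][-num_candidates:]` computes. The scores are made-up values for a vocabulary of 5 log keys, and `top_candidates` is a hypothetical helper, not a function from the repository. The point is that `output` holds one score per log key, while `predicted` holds the indices of the highest-scoring keys. The ranking is the same whether the scores are raw logits or softmax probabilities, because softmax preserves order.

```python
# Plain-Python stand-in for torch.argsort(output, 1)[0][-num_candidates:].
# All values below are illustrative, not real model output.

def top_candidates(scores, num_candidates):
    """Return the indices of the num_candidates highest scores (ascending by score)."""
    # Equivalent of argsort: key indices ordered from lowest to highest score.
    ranked = sorted(range(len(scores)), key=lambda k: scores[k])
    # The last num_candidates entries are the most likely next log keys.
    return ranked[-num_candidates:]

# Shape (1, num_classes): a batch of one window, one score per log key.
output = [[0.05, 0.40, 0.10, 0.30, 0.15]]
predicted = top_candidates(output[0], num_candidates=3)
print(predicted)  # log-key indices, not scores
```

So `predicted` is not a sorted copy of `output`: it is a short list of log-key IDs, which is why it shows up as a one-dimensional array.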

Shouldn't the value of the predicted variable be something binary so that we can determine whether the predicted outcome is anomalous or not? I am getting a one-dimensional array instead.

Sep 03 '20 13:09 nagsubhadeep

Sort the possible log keys based on their probabilities and treat a key value as normal if it’s among the top g candidates. A log key is flagged as being from an abnormal execution otherwise.

You can read the paper for details.

Sep 03 '20 14:09 wuyifan18
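The top-g rule described above is where the binary decision comes from: it is a membership test on `predicted`, not the value of `predicted` itself. A minimal sketch, assuming a hypothetical helper `is_anomalous` and made-up scores (neither is taken from the repository):

```python
def is_anomalous(scores, actual_next_key, num_candidates):
    """Flag a window as anomalous if the key that actually came next
    is not among the num_candidates most likely keys."""
    ranked = sorted(range(len(scores)), key=lambda k: scores[k])
    predicted = ranked[-num_candidates:]  # top-g candidate log keys
    return actual_next_key not in predicted

# Illustrative scores for 5 log keys; the top 3 keys are 1, 3 and 4.
scores = [0.05, 0.40, 0.10, 0.30, 0.15]
print(is_anomalous(scores, actual_next_key=3, num_candidates=3))  # key 3 is a top-3 candidate
print(is_anomalous(scores, actual_next_key=0, num_candidates=3))  # key 0 is not
```

One plausible way to extend this to a whole session is to slide a window over it and flag the session as soon as any window is anomalous, but that is an assumption about usage, not something stated in this thread.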

@wuyifan18 where can I modify top g in your code?

Jun 23 '21 20:06 Rufaida94

@Rufaida94 here https://github.com/wuyifan18/DeepLog/blob/502aaf05be4c1251b7dc96f6439025c4fc988c66/LogKeyModel_predict.py#L51

Jun 24 '21 16:06 wuyifan18

Thank you @wuyifan18. I know that num_candidates is a hyperparameter that is supposed to be changed according to the dataset. But my question is: if my data has 24297 num_classes (while your HDFS dataset has only 28 num_classes), what would be a reasonable num_candidates? For example, is 1000 too high or too low for num_candidates? I know this is a very vague question, but any pointers are appreciated.

Jul 03 '21 21:07 Rufaida94

@Rufaida94 num_candidates is a hyperparameter, which means you should tune it against your evaluation metrics, such as the F1 measure.

Jul 05 '21 02:07 wuyifan18
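One way to act on this advice is to sweep num_candidates over a labeled validation set and keep the value with the best F1. The sketch below is only an illustration under stated assumptions: each sample is a hypothetical pair `(rank, is_anomaly)`, where `rank` is the position of the actual next key in the model's ranking (0 means most likely), and the data is made up.

```python
def f1_for_g(samples, g):
    """F1 of the top-g rule: a sample is flagged when its true key ranks outside the top g."""
    tp = fp = fn = 0
    for rank, is_anomaly in samples:
        flagged = rank >= g
        if flagged and is_anomaly:
            tp += 1
        elif flagged and not is_anomaly:
            fp += 1
        elif not flagged and is_anomaly:
            fn += 1
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up validation data: (rank of the actual next key, ground-truth anomaly label).
samples = [(0, False), (1, False), (2, False), (4, False),
           (3, True), (7, True), (9, True)]

# Try every candidate g and keep the one with the highest F1.
best_g = max(range(1, 11), key=lambda g: f1_for_g(samples, g))
print(best_g, round(f1_for_g(samples, best_g), 3))
```

With tens of thousands of classes, trying every value may be slow, so a coarse grid (for example 10, 30, 100, 300, 1000) followed by a finer search around the best value is a reasonable starting point. Whether 1000 is too high or too low can only be judged from such a sweep on your own data.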
