
The experimental data of the paper cannot be reproduced

Open chinahappyking opened this issue 3 years ago • 9 comments

Hi Guo, I have tried many times, and the following results are always the same, which is far from the results in the paper. Is there any difference between the code here and the setup used for the paper?

Could you add me on WeChat for a private chat?

dataset: hdfs
git branch: main
==================== logbert ====================
best threshold: 0, best threshold ratio: 0.0
TP: 7602, TN: 549880, FP: 3488, FN: 3045
Precision: 68.55%, Recall: 71.40%, F1-measure: 69.95%
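For anyone comparing runs in this thread, the printed percentages can be recomputed from the confusion counts. A small sanity check (counts taken from the output above) confirms the reported numbers are internally consistent:

```python
# Recompute precision, recall, and F1 from the confusion-matrix counts
# reported above, to verify the printed percentages.
TP, TN, FP, FN = 7602, 549880, 3488, 3045

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

# Matches the printed line: Precision 68.55%, Recall 71.40%, F1 69.95%
print(f"Precision: {precision:.2%}, Recall: {recall:.2%}, F1-measure: {f1:.2%}")
```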

chinahappyking avatar Mar 06 '22 12:03 chinahappyking

Can you share your email?

Thanks

HelenGuohx avatar May 08 '22 17:05 HelenGuohx

I have the same issue. Did you ever manage to reproduce the results?

hniu1 avatar Oct 12 '22 17:10 hniu1

Can you share your email?

Thanks

[email protected]

chinahappyking avatar Oct 22 '22 14:10 chinahappyking

Thanks for reaching out!! My email is @.***

Best, Nick


hniu1 avatar Oct 22 '22 15:10 hniu1

I just tried and have the same results:

best threshold: 0, best threshold ratio: 0.0
TP: 7643, TN: 549806, FP: 3562, FN: 3004
Precision: 68.21%, Recall: 71.79%, F1-measure: 69.95%

I haven't looked deeply into the code yet, but is the training data really limited to n=4855 sequences, as line 122 of data_process.py seems to indicate?

generate_train_test(log_sequence_file, n=4855)

How can I train for better results?
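For context, here is a hypothetical sketch of what a cap like `n=4855` does to the split: only the first n sequences feed the train/test split, so dropping the argument lets the split use the whole corpus. The function name mirrors the call quoted above, but the body, parameters, and 80/20 ratio here are illustrative assumptions, not the repository's actual implementation:

```python
import random

def generate_train_test(sequences, n=None, seed=42):
    """Illustrative stand-in: shuffle sequences, optionally cap the pool
    at the first n items, then do an 80/20 train/test split."""
    rng = random.Random(seed)
    rng.shuffle(sequences)
    if n is not None:
        sequences = sequences[:n]      # the cap that limits training data
    split = len(sequences) * 4 // 5    # 80/20 split (integer arithmetic)
    return sequences[:split], sequences[split:]

# With the cap, only 4855 sequences are used regardless of corpus size:
train, test = generate_train_test(list(range(100_000)), n=4855)
print(len(train), len(test))   # 3884 971

# Without the cap, the whole corpus is split:
train, test = generate_train_test(list(range(100_000)))
print(len(train), len(test))   # 80000 20000
```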

jplasser avatar Jan 09 '23 20:01 jplasser

I removed n=4855 from the line of code quoted in my previous comment, and now a lot more training data is available. I'll post the new results.

jplasser avatar Jan 09 '23 21:01 jplasser

Here are my results after applying the above changes:

best threshold: 0, best threshold ratio: 0.0
TP: 6996, TN: 390662, FP: 95, FN: 3651
Precision: 98.66%, Recall: 65.71%, F1-measure: 78.88%

Recall and F1 are still lower than in the paper, which reports P=87.02%, R=78.10%, and F1=82.32%. Caveat: I stopped training after 60 epochs, so that could explain the underperformance.

jplasser avatar Jan 10 '23 10:01 jplasser

One more result, after finishing training on HDFS with a batch size of 512: val loss=0.183, train loss=0.178, 135 epochs, about 35 minutes on an RTX 3090.

best threshold: 0, best threshold ratio: 0.0
TP: 7583, TN: 390484, FP: 273, FN: 3064
Precision: 96.52%, Recall: 71.22%, F1-measure: 81.97%

jplasser avatar Jan 10 '23 11:01 jplasser

Here is my result, training on HDFS with a batch size of 512: val loss=0.537, train loss=0.451, 87 epochs, about 39 minutes on an RTX 3090.

best threshold: 0, best threshold ratio: 0.0
TP: 7908, TN: 389836, FP: 921, FN: 2739
Precision: 89.57%, Recall: 74.27%, F1-measure: 81.21%
elapsed_time: 561.5744488239288

Yudi-Pan avatar Apr 19 '23 03:04 Yudi-Pan