
How to reproduce the MSE value of AutoEncoder described in the paper?

Open wannch opened this issue 3 years ago • 9 comments

Hello! I am very interested in your research work and I am trying to reproduce the research results of your paper. As described in Section 4.2 (Model Setup) of your paper: "We trained the model for a maximum of 10 epochs and we obtained a training MSE of 2.5090e-07 and testing MSE of 2.1575e-07 – recall that a lower MSE value means a better reconstruction of the benign samples. It took roughly 40 hours to train the model on an NVIDIA GeForce RTX 2080Ti GPU."

I ran this AutoEncoder model on an NVIDIA Quadro RTX 5000, but after trying a few times I only get an MSE of about 5.4e-07 on the training set and about 5.1e-07 on the validation set, which is roughly twice the values stated in your paper! The result of the next step, extracting RoIs from the malicious dataset, is also far from the results you gave.
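For clarity, here is essentially how I am measuring these reconstruction MSE values (a self-contained toy sketch with random data and a hypothetical dense model, not DeepReflect's actual architecture or features):

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-ins for the ACFG+PLUS feature matrices.
x_train = np.random.rand(1024, 18).astype("float32")
x_test = np.random.rand(256, 18).astype("float32")

# Toy dense autoencoder, only to show where the MSE numbers come from;
# the paper's model is different.
autoencoder = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(18,)),
    tf.keras.layers.Dense(18, activation="sigmoid"),
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x_train, x_train, epochs=10, verbose=0)

# Reconstruction MSE: lower means the benign samples are
# reconstructed more faithfully.
print("train MSE:", autoencoder.evaluate(x_train, x_train, verbose=0))
print("test MSE:", autoencoder.evaluate(x_test, x_test, verbose=0))
```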

I would like to ask: what could be the reason for this discrepancy? Is there any way to achieve the results described in your paper?

wannch avatar Jan 14 '22 07:01 wannch

Hi, Sir. Did you use the author's ACFG+PLUS benign dataset? Similarly, the ACFG+PLUS malware samples are required to reproduce roughly the same results. Both datasets are available in the feature.tar.gz file shared by the author. It may be helpful if you haven't tried them yet.

nav60 avatar Jan 20 '22 17:01 nav60

Hi, Sir, thanks for your reply. I'm sure I'm using the ACFG+PLUS benign and malware datasets shared by the author. I want to know whether it's essential to check out the v0.0.1 tag for reproduction; I'm using the master branch. I noticed some differences in the code between the master branch and v0.0.1, and the framework versions used are also different.

wannch avatar Jan 23 '22 07:01 wannch

Hi, the author has shared all versions of his work, but the master/latest version is required to reproduce the results. Regarding the numbers themselves, the author can help you.

nav60 avatar Jan 23 '22 07:01 nav60

OK. Thank you, Sir. I'm a little confused because I used the same method as the author, yet the results were very different. I have contacted the author, but he is very busy, so I'm patiently waiting for his response.

wannch avatar Jan 23 '22 07:01 wannch

Sir, I'm sorry to bother you again. Could you tell me which version of the HDBSCAN package you used with the v0.0.1 tag source?

wannch avatar Feb 08 '22 13:02 wannch

Well, I used the latest version of DeepReflect, not the initial version of the project. However, can you share what the problem is? See the following link for installing a specific version of the tool: https://github.com/evandowning/deepreflect/issues/8#issuecomment-953559015
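(If it helps, you can check which HDBSCAN build you currently have with a quick one-liner; this just prints the installed version:)

```python
import hdbscan

# Report the installed hdbscan package version.
print(hdbscan.__version__)
```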

nav60 avatar Feb 08 '22 15:02 nav60

Hi all. Sorry for the delayed reply.

I'm not sure why you're getting a different result. I retrained my model from scratch before releasing the dataset and model to ensure consistency with my results.

There shouldn't be a functional difference between v0.0.1 and the latest release, but for consistency let's stick with v0.0.1 for now.

This is reminiscent of other issues people have seen before:

  • https://stackoverflow.com/questions/46303920/tensorflow-different-results-on-different-gpus
  • https://discuss.pytorch.org/t/different-training-results-on-different-machines-with-simplified-test-code/59378
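If GPU nondeterminism is the cause, one mitigation worth trying (a sketch of a generic TF 2.x workaround, not something the DeepReflect scripts do themselves) is pinning every seed and requesting deterministic kernels before training:

```python
import os
import random

# Request deterministic GPU kernels before TensorFlow initializes
# (honored by TF >= 2.1; TF >= 2.8 also offers
# tf.config.experimental.enable_op_determinism()).
os.environ["TF_DETERMINISTIC_OPS"] = "1"

import numpy as np
import tensorflow as tf

# Pin every source of randomness so repeated runs are comparable.
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)
```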

Could you provide details on your setup? (e.g., specific OS & version, Python3 version, CUDA version, etc.).

Could you also try retraining the model from scratch multiple times and see whether the results vary between runs?

evandowning avatar Feb 09 '22 17:02 evandowning

OK, sir. I ran pca_hdbscan.py to get the clustering result, but the results are different every time. Could you tell me how you obtained the results you reported? Also, after getting the clustering results, how did you label the clusters? Do we need to label them manually ourselves?
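For reference, this is the kind of repeatable behavior I expected (a toy sketch with a hypothetical feature matrix, not the real inputs to pca_hdbscan.py):

```python
import numpy as np
from sklearn.decomposition import PCA
import hdbscan

# Hypothetical stand-in for the per-function RoI feature vectors.
features = np.random.RandomState(0).rand(500, 32)

# random_state only matters if PCA uses a randomized solver;
# HDBSCAN itself is deterministic for a fixed input ordering.
reduced = PCA(n_components=2, random_state=42).fit_transform(features)
labels = hdbscan.HDBSCAN(min_cluster_size=5).fit_predict(reduced)

# Count clusters, excluding the noise label (-1).
print("clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
```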

wannch avatar Feb 10 '22 14:02 wannch


Thanks for your reply, sir. I'm using the latest release and have trained the AE model several times; every time I get the same failed result. My setup details:

  • OS: Ubuntu 18.04
  • Python: 3.7.3 (all package dependencies resolved via pip install -r requirements)
  • TensorFlow: 2.7.0
  • CUDA: 11.2
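(Collected with a quick snippet like the following, in case it's useful; tf.sysconfig.get_build_info() is available in recent TF 2.x releases:)

```python
import platform
import sys

import tensorflow as tf

# Print the environment details relevant to this issue.
print("OS:", platform.platform())
print("Python:", sys.version.split()[0])
print("TensorFlow:", tf.__version__)
print("CUDA (build):", tf.sysconfig.get_build_info().get("cuda_version"))
print("GPUs:", tf.config.list_physical_devices("GPU"))
```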

wannch avatar Feb 10 '22 14:02 wannch