GEE
PyTorch implementation of GEE: A Gradient-based Explainable Variational Autoencoder for Network Anomaly Detection
GEE: A Gradient-based Explainable Variational Autoencoder for Network Anomaly Detection
Details in blog post: https://blog.munhou.com/2020/07/12/Pytorch-Implementation-of-GEE-A-Gradient-based-Explainable-Variational-Autoencoder-for-Network-Anomaly-Detection/
How to Use
Install Dependencies
Create a new conda environment
conda create -n gee python=3.7.7
conda activate gee 
conda install pyspark=3.0.0 click=7.1.2 jupyterlab=2.1.5 seaborn=0.10.1
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=10.1 -c pytorch
conda install pytorch-lightning=0.8.4 shap=0.35.0 -c conda-forge
pip install petastorm==0.9.2
Feature Extraction
Download the processed data here or perform all the following steps.
- Download raw data march_week3_csv.tar.gz and july_week5_csv.tar.gz.
- Decompress files.
tar -xvf march_week3_csv.tar.gz
tar -xvf july_week5_csv.tar.gz
- Separate files by date.
grep '^2016-03-18' march.week3.csv.uniqblacklistremoved >> 20160318.csv
grep '^2016-03-19' march.week3.csv.uniqblacklistremoved >> 20160319.csv
grep '^2016-03-20' march.week3.csv.uniqblacklistremoved >> 20160320.csv
grep '^2016-07-30' july.week5.csv.uniqblacklistremoved >> 20160730.csv
grep '^2016-07-31' july.week5.csv.uniqblacklistremoved >> 20160731.csv
- Put 20160319.csv and 20160730.csv to data/train folder, 20160318.csv, 20160320.csv, and 20160731.csv to data/test folder.
- Perform feature extraction.
python feature_extraction.py --train data/train --test data/test --target_train feature/train.feature.parquet --target_test feature/test.feature.parquet 
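The grep-based date separation in the steps above can also be scripted, which is handy when more days need to be split out. The following is a minimal sketch, assuming every line of the raw CSV starts with a timestamp of the form YYYY-MM-DD (as the grep patterns imply). The helper name split_by_date and its parameters are hypothetical and not part of this repository.

```python
from pathlib import Path


def split_by_date(source, dates, out_dir="."):
    """Append lines of `source` that start with one of `dates` to <YYYYMMDD>.csv.

    Hypothetical helper mirroring `grep '^DATE' source >> YYYYMMDD.csv`.
    Returns a dict mapping each date to the number of lines written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = {d: 0 for d in dates}
    # One output file per date, opened in append mode like `>>`
    handles = {d: open(out_dir / (d.replace("-", "") + ".csv"), "a") for d in dates}
    try:
        with open(source) as src:
            for line in src:
                day = line[:10]  # assumes the line starts with YYYY-MM-DD
                if day in handles:
                    handles[day].write(line)
                    counts[day] += 1
    finally:
        for handle in handles.values():
            handle.close()
    return counts
```

For example, split_by_date("march.week3.csv.uniqblacklistremoved", ["2016-03-18", "2016-03-19", "2016-03-20"]) reads the file once instead of once per date.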
Normalise and Prepare Input Data for Model
Download the processed data here or perform all the following steps.
python build_model_input.py --train feature/train.feature.parquet --test feature/test.feature.parquet --target_train model_input/train.model_input.parquet --target_test model_input/test.model_input.parquet
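This README does not state which normalisation build_model_input.py applies. As an illustration of one common choice only, the sketch below shows min-max scaling in which the statistics are computed on the training data and then reused for the test data. The function names fit_min_max and apply_min_max are hypothetical and do not come from the repository.

```python
def fit_min_max(rows):
    """Return per-column (min, max) pairs computed from the training rows only."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, stats):
    """Scale each column to [0, 1] using previously fitted training statistics.

    Constant columns (max == min) map to 0.0 to avoid division by zero.
    Values outside the training range are clipped to [0, 1].
    """
    scaled = []
    for row in rows:
        out = []
        for value, (lo, hi) in zip(row, stats):
            if hi == lo:
                out.append(0.0)
            else:
                out.append(min(1.0, max(0.0, (value - lo) / (hi - lo))))
        scaled.append(out)
    return scaled
```

Fitting on the training set alone keeps information from the test days out of the model input.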
Train Model
Download pre-trained model here or perform all the following steps.
python train_vae.py --data_path model_input/train.model_input.parquet --model_path model/vae.model --gpu True
Evaluation
ROC
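An ROC curve for an autoencoder-style detector is usually built by treating each sample's reconstruction error as its anomaly score and sweeping a threshold over those scores. The sketch below shows that computation in plain Python, assuming labels of 1 for anomalous and 0 for normal traffic. The names roc_points and auc are hypothetical, and this is not the repository's evaluation code.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) points obtained by sweeping a threshold over scores.

    scores: anomaly scores (e.g. reconstruction errors), higher = more anomalous.
    labels: 1 for anomaly, 0 for normal. Both classes must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one anomaly and one normal sample")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample in a group of tied scores
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```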

Reconstruction Error Distribution

Gradient
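The gradient-based explanation named in the title can be illustrated by differentiating a reconstruction error with respect to the input features: features with large gradient magnitude are the ones the error is most sensitive to. The sketch below uses a small untrained stand-in autoencoder and a squared-error objective as simplifying assumptions. TinyAutoencoder and input_gradients are hypothetical names, and neither the repository's VAE nor its objective is reproduced here.

```python
import torch
from torch import nn


class TinyAutoencoder(nn.Module):
    """Hypothetical stand-in model; the repository's VAE is not reproduced here."""

    def __init__(self, n_features, n_hidden=4):
        super().__init__()
        self.encoder = nn.Linear(n_features, n_hidden)
        self.decoder = nn.Linear(n_hidden, n_features)

    def forward(self, x):
        return self.decoder(torch.relu(self.encoder(x)))


def input_gradients(model, x):
    """Gradient of each sample's squared reconstruction error w.r.t. its input features."""
    x = x.clone().detach().requires_grad_(True)
    # Samples do not interact in this model, so summing over the batch
    # still yields a separate gradient row for each sample.
    error = ((model(x) - x) ** 2).sum()
    error.backward()
    return x.grad.detach()
```

Ranking the absolute gradient values per row, for example with grads.abs().argsort(dim=1, descending=True), gives a per-sample ordering of the most influential features.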
