Detect-phishing-websites-using-ML
This project is a simple example that trains a model to predict whether a website is a phishing website. Phishing websites are fake websites that try to gain users' trust in order to steal their private data.
- Best accuracy score - 97.0% using Random forest method
- Worst accuracy score - 48.5% using One-class SVM method
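As a rough illustration of how two such classifiers could be compared, the sketch below trains scikit-learn's RandomForestClassifier and OneClassSVM on randomly generated data shaped like this data set (30 features with values in {-1, 0, 1} and a label in {-1, 1}). The synthetic data, the labelling rule, the function names make_synthetic_data and compare_classifiers, and all hyperparameters are assumptions made for illustration. This is not the code in classifier.py, and it will not reproduce the scores above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM


def make_synthetic_data(n_samples=400, n_features=30, seed=0):
    """Generate stand-in data; this is NOT the real UCI data set."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1, 0, 1], size=(n_samples, n_features))
    # Hypothetical rule: the label follows the sign of the first three features.
    y = np.where(X[:, :3].sum(axis=1) >= 0, 1, -1)
    return X, y


def compare_classifiers(X, y, seed=0):
    """Return test accuracy for a random forest and a one-class SVM."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )

    # Supervised model: learns from samples of both classes.
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)
    forest_acc = accuracy_score(y_test, forest.predict(X_test))

    # One-class model: fitted only on samples labelled 1,
    # then predicts 1 for inliers and -1 for outliers.
    ocsvm = OneClassSVM(gamma="auto", nu=0.1)
    ocsvm.fit(X_train[y_train == 1])
    ocsvm_acc = accuracy_score(y_test, ocsvm.predict(X_test))

    return {"random_forest": forest_acc, "one_class_svm": ocsvm_acc}


X, y = make_synthetic_data()
scores = compare_classifiers(X, y)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

A one-class SVM is fitted on a single class and treats everything else as an outlier, so it uses less information than a supervised classifier that sees both classes during training.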
Requirements
- scikit-learn (sklearn)
- NumPy
Requirements can be installed by executing pip install -r requirements.txt
Data set
The training data set is taken from the UCI Machine Learning Repository.
Execution
- Run python classifier.py to check the accuracy of the model.
- Run python classifier.py google.com to check whether google.com is a phishing website or not.
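The two invocations above differ only in whether a website is passed on the command line. A minimal sketch of that dispatch, using argparse from the standard library, could look like the following. The functions evaluate_accuracy and classify_url are hypothetical placeholders, not the actual contents of classifier.py.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Phishing website classifier (sketch)"
    )
    # Optional positional argument: when omitted, only accuracy is reported.
    parser.add_argument(
        "url", nargs="?", default=None,
        help="website to classify, e.g. google.com",
    )
    return parser


def evaluate_accuracy():
    """Hypothetical placeholder for training and scoring the model."""
    return "accuracy report goes here"


def classify_url(url):
    """Hypothetical placeholder for extracting features and predicting."""
    return f"prediction for {url} goes here"


def main(argv=None):
    # argv=None makes argparse read the real command-line arguments.
    args = build_parser().parse_args(argv)
    if args.url is None:
        return evaluate_accuracy()
    return classify_url(args.url)


if __name__ == "__main__":
    print(main())
```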
Parameters in dataset
Each record in the data set contains all of the following attributes, separated by commas. The braces list the values each attribute can take.
- having_IP_Address { -1,1 }
- URL_Length { 1,0,-1 }
- Shortining_Service { 1,-1 }
- having_At_Symbol { 1,-1 }
- double_slash_redirecting { -1,1 }
- Prefix_Suffix { -1,1 }
- having_Sub_Domain { -1,0,1 }
- SSLfinal_State { -1,1,0 }
- Domain_registeration_length { -1,1 }
- Favicon { 1,-1 }
- port { 1,-1 }
- HTTPS_token { -1,1 }
- Request_URL { 1,-1 }
- URL_of_Anchor { -1,0,1 }
- Links_in_tags { 1,-1,0 }
- SFH { -1,1,0 }
- Submitting_to_email { -1,1 }
- Abnormal_URL { -1,1 }
- Redirect { 0,1 }
- on_mouseover { 1,-1 }
- RightClick { 1,-1 }
- popUpWidnow { 1,-1 }
- Iframe { 1,-1 }
- age_of_domain { -1,1 }
- DNSRecord { -1,1 }
- web_traffic { -1,0,1 }
- Page_Rank { -1,1 }
- Google_Index { 1,-1 }
- Links_pointing_to_page { 1,0,-1 }
- Statistical_report { -1,1 }
- Result { -1,1 }
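To make the record layout concrete, the sketch below parses one comma-separated record into a dictionary of features plus its Result label, using only the standard library. It assumes the values appear in the order listed above with Result last. The function name parse_row and the sample record are made up for illustration and are not taken from the real data set.

```python
# Attribute names copied from the list above, in the same order.
COLUMNS = [
    "having_IP_Address", "URL_Length", "Shortining_Service",
    "having_At_Symbol", "double_slash_redirecting", "Prefix_Suffix",
    "having_Sub_Domain", "SSLfinal_State", "Domain_registeration_length",
    "Favicon", "port", "HTTPS_token", "Request_URL", "URL_of_Anchor",
    "Links_in_tags", "SFH", "Submitting_to_email", "Abnormal_URL",
    "Redirect", "on_mouseover", "RightClick", "popUpWidnow", "Iframe",
    "age_of_domain", "DNSRecord", "web_traffic", "Page_Rank",
    "Google_Index", "Links_pointing_to_page", "Statistical_report",
    "Result",
]


def parse_row(line):
    """Split one record into (features dict, label)."""
    values = [int(v) for v in line.strip().split(",")]
    if len(values) != len(COLUMNS):
        raise ValueError(
            f"expected {len(COLUMNS)} values, got {len(values)}"
        )
    if any(v not in (-1, 0, 1) for v in values):
        raise ValueError("every value must be -1, 0 or 1")
    record = dict(zip(COLUMNS, values))
    # Assumption: the last column, Result, is the class label.
    label = record.pop("Result")
    return record, label


# Made-up sample record (31 values), for illustration only.
SAMPLE_ROW = (
    "-1,1,1,1,-1,-1,-1,-1,-1,1,1,-1,1,-1,1,-1,"
    "-1,-1,0,1,1,1,1,-1,-1,-1,-1,1,1,-1,-1"
)

features, label = parse_row(SAMPLE_ROW)
print(len(features), "features, label =", label)
```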