aktaion2
Relevancy of different features
Looking at the list of micro-behaviors (or features) you use for machine learning, do you have any insight into which features are most relevant for detection? How did you come up with ideas for new features, and do you have a list of features/behaviors you would like to implement in the future?
Awesome project and great research, btw!
@mgalushka thanks for the question and feedback! This is a great question. We originally started from a list of features we had hand-built in v1 of this project, which was written in Java/Scala. Our initial abstraction for an API was here: https://github.com/jzadeh/aktaion/blob/master/src/main/scala/com.aktaion/ml/behaviors/MicroBehaviorLogic.scala and we then used this API to hand-build some features for exploit delivery behaviors here: https://github.com/jzadeh/aktaion/blob/master/src/main/scala/com.aktaion/ml/behaviors/ExploitationBehaviors.scala. These were motivated by a number of different research influences, particularly a paper we read that built graph similarity logic using redirect chains: "Detecting malicious HTTP redirections using trees of user browsing activity", Hesham Mekky et al., IEEE INFOCOM 2014 - IEEE Conference on Computer Communications: “…We build per-user chains from passively collected traffic and extract novel statistical features from them, which capture inherent characteristics from malicious redirection cases. Then, we apply a supervised decision tree classifier to identify malicious chains. Using a large ISP dataset, with more than 15K clients, we demonstrate that our methodology is very effective in accurately identifying malicious chains, with recall and precision values over 90% and up to 98%”
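To make the redirect-chain idea a bit more concrete, here is a minimal Python sketch of building per-user chains and computing a few statistics per chain. Everything in it is an assumption for illustration: the `Request` record, the `build_chains` and `chain_features` helpers, and the specific features are hypothetical and are not taken from the Aktaion code or from the paper. It also simplifies the paper's browsing trees to linear chains linked by the Referer header.

```python
from dataclasses import dataclass
from statistics import mean
from urllib.parse import urlsplit


@dataclass
class Request:
    """Hypothetical proxy-log record; field names are assumptions."""
    user: str      # client identifier, e.g. source IP
    ts: float      # request time in seconds
    url: str       # requested URL
    referer: str   # Referer header, "" if absent
    status: int    # HTTP response status code


def build_chains(requests):
    """Group requests into per-user linear chains by following Referer links."""
    chains = []   # each chain is a list of Request objects
    tail = {}     # (user, url) -> chain that currently ends at that url
    for req in sorted(requests, key=lambda r: (r.user, r.ts)):
        # Continue the chain ending at this request's referer, if any.
        # pop() keeps chains linear: only the first follower extends a chain.
        chain = tail.pop((req.user, req.referer), None)
        if chain is None:
            chain = []
            chains.append(chain)
        chain.append(req)
        tail[(req.user, req.url)] = chain
    return chains


def chain_features(chain):
    """Illustrative per-chain statistics (not the paper's feature set)."""
    hosts = [urlsplit(r.url).hostname or "" for r in chain]
    gaps = [b.ts - a.ts for a, b in zip(chain, chain[1:])]
    return {
        "length": len(chain),
        "distinct_hosts": len(set(hosts)),
        "cross_host_hops": sum(a != b for a, b in zip(hosts, hosts[1:])),
        "http_redirects": sum(300 <= r.status < 400 for r in chain),
        "mean_gap_s": mean(gaps) if gaps else 0.0,
    }


# Made-up example traffic for two clients.
log = [
    Request("10.0.0.5", 0.0, "http://news.example/story", "", 200),
    Request("10.0.0.5", 0.4, "http://ads.example/r?id=1",
            "http://news.example/story", 302),
    Request("10.0.0.5", 0.6, "http://landing.example/gate",
            "http://ads.example/r?id=1", 200),
    Request("10.0.0.7", 1.0, "http://intranet.example/", "", 200),
]
features = [chain_features(c) for c in build_chains(log)]
```

Each dictionary in `features` is one row that could be labeled and passed to a supervised classifier such as a decision tree, which is the general shape of the approach the paper describes.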
@mgalushka as far as our list of features goes, @rsfl is maintaining a large list of todos for additional behaviors and use cases. We have been slowly migrating some of the new ideas to a standalone workflow we are calling Chiron, which is focused on intrusion detection for the home (https://github.com/jzadeh/chiron), and to some other use cases like that where we end up implementing a specific list of features for some sub-problems. Let us know if you want to brainstorm or dive into these feature lists; any feedback would be much appreciated.