botnet-detection
Undirected or directed? And how to create a graph?
Hi, I need your help.
- In your paper you say: "All the graphs are undirected and preprocessed to have self-loops to speed up training". You also say: "we propose to use a random walk style normalization $\bar{A} = D^{-1}A$ which only involves the degree of the source nodes to equate the normalized adjacency matrix to the corresponding probability transition matrix". Here you use the term "degree of the source nodes", which I think is equivalent to "out-degree". But "out-degree" only applies to directed graphs, so I am confused about whether your graphs are undirected or directed.
- Why do self-loops speed up training? And what does "equate the normalized adjacency matrix to the corresponding probability transition matrix" mean?
- I looked at the code in the botgen folder. It seems to create a botnet by picking some nodes at random. So I want to ask: when bot nodes are selected randomly, is it possible to create a botnet with the same topology as in reality, and why?
Thanks for your help!
Hi there! Here are some clarifications:
- "degree of source node": this is for normalizing the "message" on the edges. Each edge `e = (A, B)` has two end nodes, one as the "source node" `A` and one as the "destination node" `B`. These two nodes are usually different (except for self-loops, where `A = B`). So here we just mean that we use the degree of the source node `A` instead of that of `B`. This applies to both undirected and directed graphs (in undirected graphs it is just the degree; in directed graphs it could be the in-degree or the out-degree, in which case we would use the out-degree).
- Self-loops ensure that the messages (hidden vectors from neighbors) received by a node `B` at some layer `l` can be maintained at the next layer `l + 1` when a new round of message aggregation happens. This is because the current information at node `B` is passed to itself through the self-loop. Adding self-loops to the data is a convenient way of keeping that information without changing the model logic.
- Botnets follow certain topologies in their connections. The generation code essentially creates a botnet with the expected topology, which is then overlaid onto a larger graph by randomly matching the nodes. The subnetwork with the botnet topology does not change; only its location in the full network can differ.
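To make the normalization and self-loop points concrete, here is a minimal sketch in plain Python. It is not the repository's code: the helper names `add_self_loops` and `random_walk_normalize` are made up for illustration, and it assumes a dense matrix in which the row index plays the role of the source node. Under that assumption, each row of the adjacency matrix is divided by that node's degree, so every row sums to 1 and can be read as the transition probabilities of a random walk.

```python
def add_self_loops(num_nodes, edges):
    """Dense adjacency matrix of an undirected graph, with self-loops added.

    Hypothetical helper for illustration only.
    """
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0  # undirected: store the edge in both directions
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop keeps a node's own vector in the aggregation
    return adj


def random_walk_normalize(adj):
    """Compute D^{-1} A by dividing each row by its row sum (the node degree).

    Assumption: row i corresponds to the source node i.
    """
    normalized = []
    for row in adj:
        degree = sum(row)
        # Guard against isolated nodes; with self-loops the degree is at least 1.
        normalized.append([x / degree if degree > 0 else 0.0 for x in row])
    return normalized


# Path graph 0 - 1 - 2 (undirected).
edges = [(0, 1), (1, 2)]
A = add_self_loops(3, edges)
P = random_walk_normalize(A)

for row in P:
    print([round(x, 3) for x in row])
```

Note that `P` is not symmetric even though the graph is undirected: the entry for the pair (0, 1) is scaled by the degree of node 0, while the entry for (1, 0) is scaled by the degree of node 1. This is why it makes sense to speak of a "source node" degree even on an undirected graph.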
Hope this helps!
Thanks. You have helped me a lot.