DGA-Detection
something with alexa..
Traceback (most recent call last):
File "dga_detection.py", line 314, in
thanx :)
This is because the code tries to get a million domains from alexa:
training_data = alexa.top_list(1000000)
return [a.next() for x in xrange(num)]
…but the top-1m.csv.zip downloaded by alexa/__init__.py only has ~576k domains now for some reason:
$ wc -l top-1m.csv
576602 top-1m.csv
A proper fix would be to change alexa/__init__.py to use the actual line count from the file, but if you just want a quick one, change the number in dga_detection.py:93:
training_data = alexa.top_list(576602)
(count the lines yourself; the number probably changes often)
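For the proper fix, here is a rough sketch of what a more tolerant top_list could look like. I haven't checked how alexa/__init__.py is actually laid out, so the signature below (a file object plus a count) and the rank,domain row format are assumptions for illustration, not the module's real API. The idea is just to stop calling next() a fixed number of times, which presumably is what blows up once the iterator runs dry, and let itertools.islice return however many rows exist:

```python
import csv
import io
from itertools import islice


def top_list(csv_file, num=1000000):
    """Return up to `num` (rank, domain) pairs from an Alexa-style CSV.

    Hypothetical replacement: `csv_file` is any open text file with
    "rank,domain" rows. If the file has fewer than `num` rows, the
    result is simply shorter instead of raising an error.
    """
    reader = csv.reader(csv_file)
    # Skip blank or malformed rows rather than failing on them.
    rows = ((int(row[0]), row[1]) for row in reader if len(row) == 2)
    # islice stops quietly when the underlying iterator is exhausted.
    return list(islice(rows, num))


# Asking for a million entries from a three-line file just returns three.
sample = io.StringIO("1,google.com\n2,youtube.com\n3,facebook.com\n")
print(len(top_list(sample, 1000000)))
```

With something like this, dga_detection.py could keep asking for 1000000 and would get whatever the downloaded file really contains.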