DGA-Detection

something with alexa..

Open ibarkay opened this issue 6 years ago • 1 comments

Traceback (most recent call last):
  File "dga_detection.py", line 314, in <module>
    load_data()
  File "dga_detection.py", line 93, in load_data
    training_data = alexa.top_list(1000000)
  File "/Users/_______/Documents/PycharmProjects//venv/src/alexa-top-sites/alexa/__init__.py", line 32, in top_list
    return [a.next() for x in xrange(num)]
StopIteration

Thanks :)

ibarkay · Jul 20 '19 22:07

This happens because the code asks Alexa for a million domains:

dga_detection.py:93:

training_data = alexa.top_list(1000000)

alexa/__init__.py:32:

return [a.next() for x in xrange(num)]

…but the top-1m.csv.zip downloaded by alexa/__init__.py only has ~576k domains now for some reason:

$ wc -l top-1m.csv
576602 top-1m.csv
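
The failure can be reproduced without the Alexa file. Below is a minimal sketch in Python 3 (the original code is Python 2, hence `a.next()` and `xrange`). `take_exactly` and `demo` are hypothetical names used only for illustration, not the library's code:

```python
def take_exactly(iterator, num):
    # Same pattern as alexa/__init__.py:32, ported to Python 3:
    # next() is called exactly `num` times, however many items exist.
    return [next(iterator) for _ in range(num)]


def demo():
    # Made-up stand-in for the downloaded list: only 3 entries available.
    domains = iter(["example.com", "example.org", "example.net"])
    try:
        take_exactly(domains, 5)  # asks for 5, iterator runs dry after 3
    except StopIteration:
        return "StopIteration"
    return "ok"
```

Once the iterator is exhausted, `next()` raises `StopIteration`, and the list comprehension lets it propagate to the caller, which matches the last line of the traceback.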

A proper fix would be to change alexa/__init__.py to use the actual line count from the file, but if you just want a quick one, change the number in dga_detection.py:93:

training_data = alexa.top_list(576602)

(count the lines yourself, it probably changes often)
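
For the proper fix, one option is to stop asking for an exact count and take at most `num` entries instead. This is a rough sketch in Python 3, not the actual alexa-top-sites code. It assumes the CSV rows look like `rank,domain`, and the `top_list` signature and `SAMPLE` data here are made up for illustration:

```python
import csv
import io
from itertools import islice

# Made-up sample standing in for top-1m.csv (assumed "rank,domain" rows).
SAMPLE = "1,example.com\n2,example.org\n3,example.net\n"


def top_list(csv_file, num=None):
    """Return up to `num` (rank, domain) pairs from an open CSV file.

    Hypothetical replacement for alexa.top_list; with num=None it
    returns every row in the file.
    """
    rows = ((int(row[0]), row[1]) for row in csv.reader(csv_file) if row)
    # islice simply stops when the file runs out, so a short file
    # yields a shorter list instead of raising StopIteration.
    return list(islice(rows, num))
```

With this shape, `top_list(io.StringIO(SAMPLE), 1000000)` returns the three available rows instead of failing, so the hard-coded number in dga_detection.py no longer has to match the file.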

helb · Feb 24 '20 09:02