AIL-framework
Web.py: unused variable and regex matched twice
Hi,
In Web.py, starting at line 84, we found a for loop whose loop variable 'x' is never used:
domains_list = []
PST = Paste.Paste(filename)
client = ip2asn()
for x in PST.get_regex(url_regex):
    matching_url = re.search(url_regex, PST.get_p_content())
    url = matching_url.group(0)
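Moreover, PST.get_regex already performs a re.findall(), and the loop body then runs the same regex again with re.search(). Since re.search() only returns the first match in the content, every iteration ends up processing the same URL.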
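To illustrate the problem outside of AIL, here is a minimal sketch. The regex and the sample text are simplified stand-ins I made up for this example, not the ones used in Web.py, and `urls_buggy` / `urls_fixed` are hypothetical helper names:

```python
import re

# Assumption: a simplified URL regex and sample content, for illustration only.
url_regex = r"https?://[^\s]+"
content = "see http://a.example/x and http://b.example/y"


def urls_buggy(regex, text):
    """Mimics the current loop: iterate over findall(), but re-run search()."""
    found = []
    for _ in re.findall(regex, text):
        # re.search() always returns the first match, whatever the iteration.
        found.append(re.search(regex, text).group(0))
    return found


def urls_fixed(regex, text):
    """Use the findall() results directly, one entry per match."""
    return re.findall(regex, text)


print(urls_buggy(url_regex, content))
print(urls_fixed(url_regex, content))
```

With two different URLs in the text, the first function returns the first URL twice, while the second returns both URLs.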
I suggest rewriting it like this, and using a set instead of a list for the domain list to prevent duplicate entries:
domains_list = set()
PST = Paste.Paste(filename)
client = ip2asn()
detected_urls = PST.get_regex(self.url_regex)
if len(detected_urls) > 0:
    to_print = 'Web;{};{};{};'.format(
        PST.p_source, PST.p_date, PST.p_name)
    publisher.info('{}Detected {} URL;{}'.format(
        to_print, len(detected_urls), PST.p_rel_path))
    for url in detected_urls:
        publisher.debug("match regex: %s" % (url))
        ...
line 110 -> domains_list.add(domain)
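As a rough sketch of why the set helps: the snippet below collects domains from a list of URLs and keeps each one only once. It uses `urllib.parse.urlparse` from the standard library as a stand-in for whatever domain extraction Web.py actually does, and `collect_domains` is a hypothetical name used only for this example:

```python
from urllib.parse import urlparse


def collect_domains(urls):
    """Return the distinct host names found in urls (hypothetical helper)."""
    domains_list = set()
    for url in urls:
        # Assumption: the hostname stands in for the "domain" used in Web.py.
        domain = urlparse(url).hostname
        if domain:
            # set.add() silently ignores a domain that is already present.
            domains_list.add(domain)
    return domains_list


sample_urls = [
    "http://a.example/x",
    "http://a.example/y",
    "https://b.example/",
]
print(sorted(collect_domains(sample_urls)))
```

Two of the three sample URLs share a host, so the set ends up with two entries. With a list and append(), the same domain would be stored, and later processed, once per URL.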