ail-framework

Web.py : unused var and regex matching twice

Open osagit opened this issue 3 years ago • 0 comments

Hi,

In Web.py, starting at line 84, we found a for loop whose loop variable 'x' is never used:

 domains_list = []
 PST = Paste.Paste(filename)
 client = ip2asn()
 for x in PST.get_regex(url_regex):
     matching_url = re.search(url_regex, PST.get_p_content())
     url = matching_url.group(0)

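Moreover, PST.get_regex already performs a re.findall(), and the same regex is then run a second time with re.search() on every iteration. Since re.search() always scans from the start of the content, url is the first match each time, whatever the value of x.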
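To illustrate that second point, here is a minimal standalone sketch. The sample text and the simplified `url_regex` are assumptions for the demo, not the pattern actually used in Web.py, and plain `re.findall()` stands in for PST.get_regex:

```python
import re

# Simplified pattern for illustration only (assumption, not the Web.py regex).
url_regex = r"https?://[^\s]+"
content = "see http://a.example/x and http://b.example/y"

# Stand-in for PST.get_regex(url_regex): every match in the content.
found = re.findall(url_regex, content)

# Current behaviour: the loop variable is ignored and re.search() restarts
# from the beginning of the content on each iteration.
searched = [re.search(url_regex, content).group(0) for _ in found]

print(found)     # both URLs
print(searched)  # the first URL, repeated once per match
```

With two different URLs in the content, the loop processes the first one twice and never sees the second.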

I suggest rewriting it like this, and using a set instead of a list for domains_list to prevent duplicate URLs:

            domains_list = set()
            PST = Paste.Paste(filename)
            client = ip2asn()
            detected_urls = PST.get_regex(self.url_regex)
            if len(detected_urls) > 0:
                to_print = 'Web;{};{};{};'.format(
                    PST.p_source, PST.p_date, PST.p_name)
                publisher.info('{}Detected {} URL;{}'.format(
                    to_print, len(detected_urls), PST.p_rel_path))

            for url in detected_urls:
                publisher.debug("match regex: %s" % (url))

                ...
line 110 -> domains_list.add(domain)
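The same idea can be tried outside the AIL classes. This is only a sketch under assumptions: `url_regex` is a simplified pattern, `collect_domains` is a hypothetical helper, and the domain is taken with `urllib.parse.urlsplit`, which may differ from how Web.py extracts it.

```python
import re
from urllib.parse import urlsplit

# Hypothetical simplified pattern; the real one in Web.py is more complete.
url_regex = r"https?://[^\s]+"


def collect_domains(content):
    """Return the set of distinct hostnames among the URLs found in content."""
    domains = set()
    # Iterate over the matches directly instead of searching again.
    for url in re.findall(url_regex, content):
        host = urlsplit(url).hostname
        if host:
            domains.add(host)  # a set silently ignores duplicates
    return domains


content = "http://a.example/1 http://a.example/2 https://b.example/"
print(sorted(collect_domains(content)))
```

One caveat: if the real pattern contains capturing groups, `re.findall()` returns tuples of groups rather than whole matches. In that case `re.finditer()` with `match.group(0)` gives the full URL.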

osagit, Apr 01 '21 14:04