geeksforgeeks-pdf
Doesn't crawl the webpage
I changed line 16 to:
if 'href' in getattr(link, 'attrs', {}):
because the original line raised this error:
AttributeError: 'Doctype' object has no attribute 'has_attr'
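For reference, here is a minimal sketch of two other ways to avoid that AttributeError. It assumes the script walks every node in the parsed document, which I have not confirmed; SAMPLE_HTML and both function names are made up for illustration and are not from the repository. The error happens because nodes such as the Doctype are not Tag objects and so have no has_attr() method.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical sample page; the real script fetches its HTML from the site.
SAMPLE_HTML = """<!DOCTYPE html>
<html><body>
<a href="/page-1">One</a>
<a name="no-href">Anchor without href</a>
<a href="/page-2">Two</a>
</body></html>"""


def collect_links(html):
    """Walk every node, skipping anything that is not a Tag."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for node in soup.descendants:
        # Doctype, comments and text nodes are not Tags and lack has_attr().
        if isinstance(node, Tag) and node.has_attr("href"):
            links.append(node["href"])
    return links


def collect_links_find_all(html):
    """Let find_all() do the filtering: it only returns Tag objects."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


print(collect_links(SAMPLE_HTML))
print(collect_links_find_all(SAMPLE_HTML))
```

Both functions return only the href values of real tags, so the Doctype node is never asked for has_attr(). The find_all("a", href=True) form is shorter, but it only looks at anchor tags.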
The script also emits the following UserWarning:
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this
system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different
virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml")
markup_type=markup_type))
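A small sketch of the change the warning asks for: naming the parser explicitly. The markup string here is a made-up placeholder. I use "html.parser" because it ships with Python; "lxml" is passed the same way if it is installed.

```python
import warnings

from bs4 import BeautifulSoup

# Placeholder markup for illustration only.
html = "<html><body><p>Hello</p></body></html>"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Passing the parser name as the second argument avoids the
    # "No parser was explicitly specified" warning.
    soup = BeautifulSoup(html, "html.parser")

# Keep only warnings that mention the parser choice.
parser_warnings = [w for w in caught if "parser" in str(w.message)]

print(len(parser_warnings))
print(soup.p.get_text())
```

Fixing the parser name also means the script parses the page the same way on every machine, which is the concern the warning describes.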
Now it doesn't crawl the page; it just prints:
Finished!
0