geeksforgeeks-pdf Doesn't crawl the webpage

Open madhuradlakha opened this issue 9 years ago • 0 comments

I changed line 16 to:

if 'href' in getattr(link, 'attrs', {}):

because the original line raised this error:

AttributeError: 'Doctype' object has no attribute 'has_attr'
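For context, this error usually means the loop is visiting nodes that are not tags: a Doctype is a string-like node and has no has_attr method. The sketch below shows one alternative guard, an isinstance check against Tag. It is not the project's actual code; the sample markup and the helper name collect_hrefs are made up for illustration, and it assumes the script iterates over every parsed node.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical sample page; the real script fetches its markup elsewhere.
html = (
    "<!DOCTYPE html><html><body>"
    "<a href='/a'>A</a><a>no href</a><p>text</p>"
    "</body></html>"
)
soup = BeautifulSoup(html, "html.parser")


def collect_hrefs(nodes):
    """Return href values, skipping non-tag nodes such as Doctype and text."""
    hrefs = []
    for node in nodes:
        # Only Tag objects provide has_attr(); Doctype and plain strings do not.
        if isinstance(node, Tag) and node.has_attr("href"):
            hrefs.append(node["href"])
    return hrefs


links = collect_hrefs(soup.descendants)
print(links)
```

If only anchor tags matter, soup.find_all("a", href=True) returns just the tags that carry an href, so no guard is needed at all.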

It also shows a user warning as follows:

UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

BeautifulSoup([your markup])

to this:

BeautifulSoup([your markup], "lxml")

markup_type=markup_type))
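A minimal sketch of the fix the warning suggests, passing the parser name explicitly. It uses "html.parser" because that parser ships with Python, whereas "lxml" works the same way only if the lxml package is installed. The markup string is a made-up placeholder, and the warnings filter is there only to show that no warning is raised.

```python
import warnings

from bs4 import BeautifulSoup

# Placeholder markup; substitute the page content the script downloads.
markup = "<html><body><a href='https://example.com/'>example</a></body></html>"

with warnings.catch_warnings():
    # Turn any warning into an exception so a missing parser argument would fail loudly.
    warnings.simplefilter("error")
    soup = BeautifulSoup(markup, "html.parser")

first_href = soup.find("a")["href"]
print(first_href)
```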

Now it doesn't crawl the page; it just prints: Finished! 0

Jan 28 '16 19:01 madhuradlakha
