python-goose
li tags in HTML not extracted
Please check the following page: http://www.hiewatch.com/news/trump-transition-team-hears-interoperability-pitch. The extracted text does not include the 4 points listed in the body of the article.
This happens because python-goose relies on many hardcoded values in its classes and functions. You can take a close look at those functions and consider changing some of the values, for example in cleaners.py.
This was looked into and resolved in goose3, the Python 3 port of the library (also maintained).
Full Disclosure: I help maintain goose3
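To check whether a given extractor drops list items as reported above, a small standard-library helper can compare the `<li>` text in the raw HTML against the extracted text. This is only a rough sketch: `ListItemCollector` and `missing_list_items` are hypothetical names made up for illustration, not part of python-goose or goose3, and the helper assumes every `<li>` has an explicit closing tag.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collect the visible text of every explicitly closed <li> element."""

    def __init__(self):
        super().__init__()
        self.items = []   # finished <li> texts, in the order their end tags appear
        self._stack = []  # one text buffer per currently open <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._stack.append([])

    def handle_endtag(self, tag):
        if tag == "li" and self._stack:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._stack.pop()).split())
            if text:
                self.items.append(text)

    def handle_data(self, data):
        # Credit text to the innermost open <li> only.
        if self._stack:
            self._stack[-1].append(data)


def missing_list_items(html, extracted_text):
    """Return the <li> texts from html that do not appear in extracted_text."""
    collector = ListItemCollector()
    collector.feed(html)
    collector.close()
    normalized = " ".join(extracted_text.split())
    return [item for item in collector.items if item not in normalized]
```

Passing the page's raw HTML together with the `cleaned_text` of the article returned by `Goose().extract(url=...)` should list whichever bullet points the extractor left out. An empty list means every list item made it into the extracted text.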