Parsing issue, getting contents of select box
Issue by daltonch
Thu Mar 9 20:04:53 2017
Originally opened as https://github.com/codelucas/newspaper/issues/344
from newspaper import Article
url = "http://www.refworld.org/docid/58b03ed44.html"
article = Article(url)
article.download()
article.parse()
article.text
I get the following text,
"Search Refworld\n\nand / or country All countries Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla Antigua and Barbuda Argentina Armenia Aruba Australia Austria Azerbaijan Bahamas Bahrain Bangladesh Barbados Belarus Belgium Belize Benin Bermuda Bhutan Bolivia Bosnia and Herzegovina Botswana Brazil British Virgin Islands Brunei Darussalam Bulgaria Burkina Faso Burundi Cambodia Cameroon Canada Cape Verde Cayman Islands Central African Republic Chad Chile China Cocos (Keeling) Islands Colombia Comoros Congo, Democratic Republic of the Congo,...."
instead of the actual article text. It appears to pull the contents of a select box, including all of its options. The page even has an