Wikipedia-API
Newline / Space missing from .summary attribute
The .summary attribute of a page omits the newline or space after a sentence that, on the Wikipedia page, ends with footnote markers in square brackets [ ].
Example:
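import wikipediaapi as wiki_api  # assumed alias; the package's module is wikipediaapi
wiki = wiki_api.Wikipedia(language="en")
query = "planet"
page = wiki.page(query)
text = page.summary
print(text[:400])  # first 400 characters of the summary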
which queries the article: https://en.wikipedia.org/wiki/Planet
and returns:
A planet is an astronomical body orbiting a star or stellar remnant that is massive enough to be rounded by its own gravity, is not massive enough to cause thermonuclear fusion, and – according to the International Astronomical Union but not all planetary scientists – has cleared its neighbouring region of planetesimals.The term planet is ancient, with ties to history, astrology, science, mytholog
Observe the lack of a space between "planetesimals." and "The" at the end of the first paragraph, which ends with "planetesimals.[b][1][2]" on the web page.
By contrast, the spacing later in the summary is as expected. Running
print(text[1200:1500])
shows a space between "discovered)." and "Ptolemy":
the scientific community are no longer viewed as such under the current definition. Some of the excluded objects include Ceres, Pallas, Juno, Vesta (all of which are objects in the solar asteroid belt), and Pluto (the first trans-Neptunian object discovered). Ptolemy thought that the planets orbite
Please let me know if any additional information is needed to fix this, or if there is a workaround.
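For now, one possible workaround is to post-process the summary text. The sketch below is only an assumption about what might help, not part of Wikipedia-API: add_missing_spaces is a hypothetical helper that inserts a space wherever sentence-ending punctuation (preceded by a lowercase letter or a closing bracket) is followed immediately by a capital letter.

```python
import re

# Zero-width position between sentence-ending punctuation and a capital letter.
# The lookbehind requires a lowercase letter or closing bracket before the
# punctuation, so abbreviations such as "U.S.A" are left alone.
_MISSING_SPACE = re.compile(r"(?<=[a-z)\]][.!?])(?=[A-Z])")


def add_missing_spaces(text: str) -> str:
    """Insert a space where two sentences have been joined without one."""
    return _MISSING_SPACE.sub(" ", text)


sample = "its neighbouring region of planetesimals.The term planet is ancient"
print(add_missing_spaces(sample))
```

This is a heuristic: it cannot restore the original paragraph break (it always inserts a space, never a newline), and it will also split strings such as "e.g.Foo" that were never sentence boundaries.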