Wikipedia
DisambiguationError gives titles that lead to the same error
Hi, this happened while working with wikipedia with lang 'es'.
I'm trying to get wikipedia.page('Alfa Romeo Giulia')
and I get a DisambiguationError with these options:
['Alfa Romeo Giulia',
'Alfa Romeo Giulia GT Veloce',
'Alfa Romeo Giulia TZ',
'Alfa Romeo Giulia']
The first and last options lead me to the same error. I cannot get the actual URLs from the error arguments.
For this case they would be:
https://es.wikipedia.org/wiki/Alfa_Romeo_Giulia_(1962)
https://es.wikipedia.org/wiki/Alfa_Romeo_Giulia_GT_Veloce
https://es.wikipedia.org/wiki/Alfa_Romeo_Giulia_TZ
https://es.wikipedia.org/wiki/Alfa_Romeo_Giulia_(2015)
Thanks!
I have the same problem with wikipedia.summary. This happened with lang 'de'.
import wikipedia

wikipedia.set_lang("de")
try:
    summary: str = wikipedia.summary("Schlacht von Pjöngjang")
except wikipedia.exceptions.DisambiguationError as e:
    new_query = e.options[-1]  # select the last suggestion
    summary = wikipedia.summary(new_query)
Yields another wikipedia.exceptions.DisambiguationError.
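Until the options are fixed in the library, one possible stopgap is to drop every option that merely repeats the original query before retrying, since those entries will raise the same error again. The sketch below is only an illustration: `usable_options` is a hypothetical helper, not part of the wikipedia package, and the data is the 'es' option list reported earlier in this thread. It does not recover the missing pages; it only avoids retrying options that cannot work.

```python
def usable_options(query, options):
    """Return the options that differ from the query, in order, without duplicates.

    Hypothetical helper (not part of the wikipedia package). Comparison
    ignores case and repeated whitespace only.
    """
    def norm(text):
        return " ".join(text.split()).casefold()

    seen = {norm(query)}  # treat the query itself as already tried
    result = []
    for option in options:
        key = norm(option)
        if key not in seen:
            seen.add(key)
            result.append(option)
    return result


# Option list reported above for lang 'es'.
options = [
    "Alfa Romeo Giulia",
    "Alfa Romeo Giulia GT Veloce",
    "Alfa Romeo Giulia TZ",
    "Alfa Romeo Giulia",
]
candidates = usable_options("Alfa Romeo Giulia", options)
print(candidates)
```

In an `except wikipedia.exceptions.DisambiguationError as e:` block, the same helper could be applied to `e.options` before choosing a new query. The pages for the 1962 and 2015 models remain unreachable this way, which is why the options themselves need to carry the real titles.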
There is a quick fix that I tested with French, and it seemed to work well. The main problem is that, when handling the disambiguation page, the code puts each link's visible text into the options list rather than the link's title attribute, which is the actual title of the target page.
For example, for "Émancipation" the last element of the disambiguation list is "Emancipation". That is a problem because it is effectively the same as the original search, while the link's title is "Emancipation (Stargate)". A search with that title returned the correct page.
So I updated this line on my wikipedia.py file from this:
may_refer_to = [li.a.get_text() for li in filtered_lis if li.a]
to this:
may_refer_to = [li.a.get('title') for li in filtered_lis if li.a]
So far it has worked well in my limited testing in French. It would also be possible to return the href of each link instead of the title, so that the page can be requested directly with a GET request.
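To make the difference between the three choices concrete, here is a minimal sketch that extracts the visible text, the title attribute, and the href from the first link of each list item. It is not the library's code: it uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies, and `SAMPLE_HTML` is a made-up fragment shaped like a disambiguation list, not markup copied from a real page.

```python
from html.parser import HTMLParser

# Hypothetical fragment, assumed to resemble a disambiguation list.
SAMPLE_HTML = """
<ul>
  <li><a href="/wiki/%C3%89mancipation" title="Émancipation">Émancipation</a>, the general concept</li>
  <li><a href="/wiki/Emancipation_(Stargate)" title="Emancipation (Stargate)">Emancipation</a>, a TV episode</li>
  <li>An entry without a link</li>
</ul>
"""


class FirstLinkPerItem(HTMLParser):
    """Collect text, title and href of the first <a> inside each <li>."""

    def __init__(self):
        super().__init__()
        self.links = []           # one dict per <li> that contains a link
        self._in_li = False
        self._seen_link = False   # True once the current <li> has had its first <a>
        self._in_first_a = False  # True while inside that first <a>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li, self._seen_link = True, False
        elif tag == "a" and self._in_li and not self._seen_link:
            attributes = dict(attrs)
            self.links.append({
                "text": "",
                "title": attributes.get("title"),
                "href": attributes.get("href"),
            })
            self._seen_link = True
            self._in_first_a = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_first_a = False
        elif tag == "li":
            self._in_li = False

    def handle_data(self, data):
        if self._in_first_a:
            self.links[-1]["text"] += data


def parse_links(html):
    parser = FirstLinkPerItem()
    parser.feed(html)
    parser.close()
    return parser.links


links = parse_links(SAMPLE_HTML)
visible_text = [link["text"] for link in links]   # analogous to li.a.get_text()
titles = [link["title"] for link in links]        # analogous to li.a.get('title')
hrefs = [link["href"] for link in links]          # analogous to li.a.get('href')
print(visible_text, titles, hrefs, sep="\n")
```

In this sample the visible text of the second entry is just "Emancipation", while its title and href both identify the specific page. A real link may lack a title attribute, in which case `title` is `None` here, so a fallback to the visible text would probably be needed.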