AssertionError: This shouldn't happen
Not too much info here, but I do get a stack trace. Looks like the page title may differ from what's expected?
  File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/lidarrmetadata/provider.py", line 687, in get_summary
    return wikipedia.summary(title, auto_suggest=False)
  File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/util.py", line 28, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
  File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 231, in summary
    page_info = page(title, auto_suggest=auto_suggest, redirect=redirect)
  File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 357, in __load
    assert normalized['from'] == self.title, ODD_ERROR_MESSAGE
AssertionError: This shouldn't happen. Please report on GitHub: github.com/goldsmith/Wikipedia
If it is an issue of the title differing, it may have something to do with the way I'm grabbing page titles from URLs. Here's the bit of my code that probably matters:
URL_REGEX = re.compile(r'https?://\w+\.wikipedia\.org/wiki/(?P<title>.+)')
title = URL_REGEX.match(url).group('title')
title = urllib.unquote(title)
wikipedia.summary(title, auto_suggest=False)
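For anyone trying to reproduce this on Python 3, the extraction step might be sketched roughly as follows. The helper name `title_from_url` is hypothetical, the example article URL is an assumption inferred from the API request quoted later in the thread, and `urllib.parse.unquote` stands in for the Python 2 `urllib.unquote` call. One difference worth noting: on Python 3 the result is decoded text, whereas Python 2's `urllib.unquote` applied to a byte string returns raw UTF-8 bytes.

```python
import re
from urllib.parse import unquote

# Same pattern as in the snippet above.
URL_REGEX = re.compile(r'https?://\w+\.wikipedia\.org/wiki/(?P<title>.+)')


def title_from_url(url):
    """Return the percent-decoded page title, or None if the URL does not match.

    Hypothetical helper, not part of lidarrmetadata or the wikipedia library.
    """
    match = URL_REGEX.match(url)
    if match is None:
        return None
    # unquote() decodes percent-escapes as UTF-8 and returns a str.
    return unquote(match.group('title'))


# Assumed example URL; the title contains a non-ASCII character.
title = title_from_url('https://en.wikipedia.org/wiki/Sidiki_Diabat%C3%A9')
```

Returning `None` for a non-matching URL avoids the `AttributeError` that `URL_REGEX.match(url).group('title')` would raise when there is no match.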
I'm having trouble reproducing the error. I've run the bit of your code that matters on a few random articles and everything returns fine. Are you able to provide a bit more detail about your environment, and maybe a specific example URL that reliably reproduces the error?
I'll try to see if I can reproduce it tomorrow, but it looks like there's an encoding error somewhere: 'ascii' codec can't decode byte 0xc3 in position 13: ordinal not in range(128). Don't know if that's on my side or not.
The actual URL it's requesting is https://en.wikipedia.org/w/api.php?inprop=url&redirects=&format=json&ppprop=disambiguation&prop=info%7Cpageprops&titles=Sidiki_Diabat%C3%A9&action=query, so I'd bet it's the é causing the issue. I know of some encoding issues on my side and will check into that, but in that case I'd find it strange that the library can fetch the page and then fail inside its own code.
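To check which title that request actually carries, the query string can be decoded with the standard library. This is only a Python 3 sketch for inspection; the variable names are illustrative and nothing here comes from the wikipedia library itself.

```python
from urllib.parse import parse_qs, urlsplit

# The request URL quoted above, split across lines for readability.
request_url = (
    'https://en.wikipedia.org/w/api.php?inprop=url&redirects=&format=json'
    '&ppprop=disambiguation&prop=info%7Cpageprops'
    '&titles=Sidiki_Diabat%C3%A9&action=query'
)

# keep_blank_values=True preserves the empty "redirects=" parameter.
params = parse_qs(urlsplit(request_url).query, keep_blank_values=True)

# parse_qs percent-decodes values as UTF-8, so %C3%A9 becomes a single character.
titles = params['titles']
```

Here `%C3%A9` is the two-byte UTF-8 encoding of é, which matches the `0xc3` byte named in the codec error.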
I've done a bit of digging and it looks like it's an encoding error somewhere. I'm able to somewhat patch the problem by changing wikipedia.py:364 from
assert redirects['from'] == from_title, ODD_ERROR_MESSAGE
to
assert redirects['from'].encode('utf-8') == from_title, ODD_ERROR_MESSAGE
Unfortunately, I'm not sure where else needs these changes. It looks like this repo isn't currently maintained, so I'm testing out the fork lehinevych/MediaWikiAPI. So far, it doesn't seem to have this issue.
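One possible reading of why the `.encode('utf-8')` patch helps on Python 2.7, assuming the title handed to the library is a UTF-8 byte string while the API response is decoded text: the two values never compare equal, so the assertion fails. The Python 3 sketch below models that assumption, using `unquote_to_bytes` to stand in for the Python 2 byte string; the variable names are illustrative.

```python
from urllib.parse import unquote, unquote_to_bytes

raw_title = 'Sidiki_Diabat%C3%A9'

# Assumed caller-side value on Python 2: raw UTF-8 bytes.
title_as_bytes = unquote_to_bytes(raw_title)

# Assumed shape of the value echoed back in the API's JSON: decoded text.
title_from_api = 'Sidiki_Diabat\u00e9'

# Text and bytes do not compare equal, which would trip the assertion.
mismatch = title_from_api != title_as_bytes

# Encoding the API value, as in the patch, makes the comparison succeed.
patched_match = title_from_api.encode('utf-8') == title_as_bytes

# Alternatively, decoding on the caller's side gives text that matches as-is.
caller_fix_match = unquote(raw_title) == title_from_api
```

If that assumption holds, decoding the title before calling the library (on Python 2, `urllib.unquote(title).decode('utf-8')`) may avoid the mismatch without patching `wikipedia.py`. It would not explain a failure on Python 3, where the title is already text.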
  File "script.py", line 119, in visit
    page = wikipedia.WikipediaPage(title=title)
  File "/Users/iacopy/.virtualenvs/project/lib/python3.7/site-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/Users/iacopy/.virtualenvs/project/lib/python3.7/site-packages/wikipedia/wikipedia.py", line 364, in __load
    assert redirects['from'] == from_title, ODD_ERROR_MESSAGE
AssertionError: This shouldn't happen. Please report on GitHub: github.com/goldsmith/Wikipedia