AssertionError: This shouldn't happen

Open danielunderwood opened this issue 7 years ago • 4 comments

Not too much info here, but I do get a stack trace. Looks like the page title may differ from what's expected?

File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/lidarrmetadata/provider.py", line 687, in get_summary
  return wikipedia.summary(title, auto_suggest=False)
File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/util.py", line 28, in __call__
  ret = self._cache[key] = self.fn(*args, **kwargs)
File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 231, in summary
  page_info = page(title, auto_suggest=auto_suggest, redirect=redirect)
File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 276, in page
  return WikipediaPage(title, redirect=redirect, preload=preload)
File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 299, in __init__
  self.__load(redirect=redirect, preload=preload)
File "/opt/metadata/lidarrapi.metadata/venv/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 357, in __load
  assert normalized['from'] == self.title, ODD_ERROR_MESSAGE
AssertionError: This shouldn't happen. Please report on GitHub: github.com/goldsmith/Wikipedia

If it is an issue of the title differing, it may have something to do with the way I'm grabbing page titles from URLs. Here's the bit of my code that probably matters:

import re
import urllib
import wikipedia
URL_REGEX = re.compile(r'https?://\w+\.wikipedia\.org/wiki/(?P<title>.+)')
title = URL_REGEX.match(url).group('title')
title = urllib.unquote(title)  # Python 2.7 API; returns a byte string when given one
wikipedia.summary(title, auto_suggest=False)

danielunderwood avatar Dec 31 '17 16:12 danielunderwood

I'm having trouble reproducing the error. I've run the relevant bit of your code on a few random articles and everything returns fine. Could you provide more detail about your environment, and a specific URL that reliably reproduces the error?

wbchilds avatar Jan 10 '18 23:01 wbchilds

I'll try to see if I can reproduce it tomorrow, but it looks like there's an encoding error somewhere: 'ascii' codec can't decode byte 0xc3 in position 13: ordinal not in range(128). Don't know if that's on my side or not.

The actual URL it's requesting is https://en.wikipedia.org/w/api.php?inprop=url&redirects=&format=json&ppprop=disambiguation&prop=info%7Cpageprops&titles=Sidiki_Diabat%C3%A9&action=query, so I'd bet the é is causing the issue. I know of some encoding issues on my side and will look into those, but it seems strange that the request for the page can succeed and the error then occurs inside the wikipedia lib.
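
For reference, a minimal sketch of pulling the title out of an article URL as Unicode text. It assumes Python 3, where urllib.parse.unquote decodes percent-escapes as UTF-8 and returns str. The helper name title_from_url and the tightened character class [^?#]+ (which stops at a query string or fragment) are illustrative assumptions, not part of the original code or of the wikipedia library.

```python
import re
from urllib.parse import unquote

# Same idea as the regex in the report, but the title stops at '?' or '#'.
URL_REGEX = re.compile(r'https?://\w+\.wikipedia\.org/wiki/(?P<title>[^?#]+)')


def title_from_url(url):
    """Return the page title from a Wikipedia article URL as text, or None.

    Hypothetical helper for illustration only.
    """
    match = URL_REGEX.match(url)
    if match is None:
        return None
    # In Python 3, unquote() decodes %C3%A9 as UTF-8 and returns str.
    return unquote(match.group('title'))


print(title_from_url('https://en.wikipedia.org/wiki/Sidiki_Diabat%C3%A9'))
```

On Python 2.7 the closest equivalent would be urllib.unquote(title).decode('utf-8'), which yields a unicode object rather than a byte string.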

danielunderwood avatar Jan 12 '18 01:01 danielunderwood

I've done a bit of digging and it looks like it's an encoding error somewhere. I'm able to somewhat patch the problem by changing wikipedia.py:364 from

assert redirects['from'] == from_title, ODD_ERROR_MESSAGE

to

assert redirects['from'].encode('utf-8') == from_title, ODD_ERROR_MESSAGE
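
An alternative to encoding one side at each comparison is to normalize both values to text before comparing. The following is only a sketch of that idea under Python 3 semantics. The helpers to_text and same_title are hypothetical names and do not exist in the wikipedia library.

```python
def to_text(value, encoding='utf-8'):
    """Decode bytes to str; return str values unchanged (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def same_title(a, b):
    """Compare two titles after normalizing both to text."""
    return to_text(a) == to_text(b)


api_title = 'Sidiki_Diabat\u00e9'         # text, as decoded from a JSON response
caller_title = b'Sidiki_Diabat\xc3\xa9'   # UTF-8 bytes, as a caller might pass in

print(api_title == caller_title)            # False: str and bytes never compare equal
print(same_title(api_title, caller_title))  # True
```

Normalizing in one place avoids having to find every comparison that needs an .encode('utf-8') call.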

Unfortunately, I'm not sure where else needs these changes. It looks like this repo isn't currently maintained, so I'm testing out the fork lehinevych/MediaWikiAPI. So far, it doesn't seem to have this issue.

danielunderwood avatar Apr 04 '18 22:04 danielunderwood

  File "script.py", line 119, in visit
    page = wikipedia.WikipediaPage(title=title)
  File "/Users/iacopy/.virtualenvs/project/lib/python3.7/site-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/Users/iacopy/.virtualenvs/project/lib/python3.7/site-packages/wikipedia/wikipedia.py", line 364, in __load
    assert redirects['from'] == from_title, ODD_ERROR_MESSAGE
AssertionError: This shouldn't happen. Please report on GitHub: github.com/goldsmith/Wikipedia

iacopy avatar May 09 '20 12:05 iacopy