
Handling of suggestions could be improved

Open slzatz opened this issue 5 years ago • 2 comments

This line in the page function:

title = suggestion or results[0]

does not seem correct. It should at least check whether results[0] exactly matches the title that the user provided. In other words, something like:

title = title if title == results[0] else suggestion

For a concrete example, try searching for "Steven Pinker" with auto_suggest on. The suggestion will be "Stephen Pinker", which is incorrect. Because the suggestion wins in the current implementation, you get an exception instead of the page whose title you entered correctly.
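The difference between the two rules can be checked offline. The sketch below is not library code: `current_rule` and `exact_match_rule` are hypothetical helper names, the search output for "Steven Pinker" is hard-coded from the behavior described above rather than fetched from Wikipedia, and the extra fallbacks in `exact_match_rule` (top hit, then an error) are an assumption added so that the function never returns None.

```python
def current_rule(results, suggestion):
    # Mirrors `title = suggestion or results[0]`:
    # the suggestion wins whenever one is present.
    return suggestion or results[0]


def exact_match_rule(title, results, suggestion):
    # Keep the user's title when the top search hit matches it exactly;
    # otherwise fall back to the suggestion, then to the top hit.
    if results and results[0] == title:
        return title
    if suggestion:
        return suggestion
    if results:
        return results[0]
    raise LookupError(f"no page found for {title!r}")


# Hard-coded stand-in for the search output described in this issue.
results, suggestion = ["Steven Pinker"], "Stephen Pinker"

print(current_rule(results, suggestion))                       # Stephen Pinker
print(exact_match_rule("Steven Pinker", results, suggestion))  # Steven Pinker
```

With the exact-match check in place, a correctly spelled title is never replaced by a suggestion, while misspelled queries can still benefit from one.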

slzatz, Apr 18 '20 15:04

This modified version of the wikipedia.page() function seems to produce decent results (note also the change to 4-space indentation):

import wikipedia


def ap_page(title=None, pageid=None, auto_suggest=True, redirect=True, preload=False):
    # Copied from wikipedia.py, modified to fix poor search results.
    # See https://github.com/goldsmith/Wikipedia/issues/227.

    '''
    Get a WikipediaPage object for the page with title `title` or the pageid
    `pageid` (mutually exclusive).

    Keyword arguments:

    * title - the title of the page to load
    * pageid - the numeric pageid of the page to load
    * auto_suggest - let Wikipedia find a valid page title for the query
    * redirect - allow redirection without raising RedirectError
    * preload - load content, summary, images, references, and links during initialization
    '''

    if not title and not pageid:
        raise ValueError("Either a title or a pageid must be specified")

    if pageid:
        return wikipedia.WikipediaPage(pageid=pageid, preload=preload)

    if title:
        if auto_suggest:
            results, suggestion = wikipedia.search(title, results=1, suggestion=True)

            if results:
                # The search above asks for results=1, so there is at most
                # one hit. Prefer it over the suggestion, which can be wrong.
                title = results[0]
            elif suggestion:
                # No search hit: fall back to the suggestion.
                title = suggestion
            else:
                # No results and no suggestion.
                raise wikipedia.PageError(title)

        return wikipedia.WikipediaPage(title, redirect=redirect, preload=preload)
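The lookup order in the function above can be exercised without network access by passing the search function in as a parameter. This is only a sketch, not part of the wikipedia package: `resolve_title` and `fake_search` are hypothetical names, and `fake_search` merely imitates the `(results, suggestion)` tuple that `wikipedia.search(..., suggestion=True)` returns, using made-up canned data.

```python
def resolve_title(title, search):
    """Return the title to load, preferring a search hit over the suggestion."""
    results, suggestion = search(title, results=1, suggestion=True)
    if results:
        return results[0]
    if suggestion:
        return suggestion
    raise LookupError(f"no results and no suggestion for {title!r}")


def fake_search(query, results=1, suggestion=True):
    # Canned responses standing in for the real API (assumed data).
    canned = {
        "Steven Pinker": (["Steven Pinker"], "Stephen Pinker"),
        "Stevn Pinkr": ([], "Steven Pinker"),
    }
    return canned.get(query, ([], None))


print(resolve_title("Steven Pinker", fake_search))  # Steven Pinker
print(resolve_title("Stevn Pinkr", fake_search))    # Steven Pinker
```

Injecting the search callable keeps the decision logic testable on its own; the real `wikipedia.search` could be passed in the same position.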

alphapapa, Oct 20 '20 03:10

On the other hand, this library seems to work well as an alternative: https://github.com/barrust/mediawiki

alphapapa, Oct 20 '20 04:10