Missing and fake locations in Wikipedia POI layer

Open zstadler opened this issue 1 year ago • 2 comments

What happened?

This issue follows a phone discussion and a user bug report.

The current algorithm for adding Wikipedia POIs to IHM has a few issues:

  1. It may drop Wikipedia pages that should be shown or merged with OSM POIs.
  2. It may show Wikipedia pages that do not have a location.
  3. It is not scalable.

This issue proposes to reverse the search direction. Rather than searching Wikipedia for pages that exist within a given bounding box, the new approach would identify the relevant Wikipedia pages based on OSM tags.

The downside of such a change is that the IHM POI layer will no longer show Wikipedia pages that do not have an OSM element.

The identification could use the wikipedia, wikipedia:he, and wikipedia:en tags, as used in the POI merge algorithm, or it could rely solely on wikidata tags, similar to openmaptiles.
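A minimal sketch of the tag-based identification, not the IHM implementation. It assumes the common OSM convention that the plain `wikipedia` tag holds `<lang>:<title>`, while `wikipedia:he` and `wikipedia:en` hold the title only. The function name `wikipedia_pages_from_tags` and the `LANGUAGES` tuple are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple

# Languages considered in this sketch; taken from the tags named in this issue.
LANGUAGES = ("he", "en")


def wikipedia_pages_from_tags(tags: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (language, title) pairs referenced by an OSM element's tags."""
    pages: List[Tuple[str, str]] = []

    # Assumption: the plain `wikipedia` tag is formatted as "<lang>:<title>".
    value = tags.get("wikipedia", "").strip()
    if ":" in value:
        lang, title = value.split(":", 1)
        lang, title = lang.strip(), title.strip()
        if lang in LANGUAGES and title:
            pages.append((lang, title))

    # Language-specific tags carry the page title only.
    for lang in LANGUAGES:
        title = tags.get(f"wikipedia:{lang}", "").strip()
        if title and (lang, title) not in pages:
            pages.append((lang, title))

    return pages
```

With this direction of search, an OSM element without any of these tags yields an empty list and therefore no Wikipedia POI, which is the downside described above.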

Note: The current search-and-match algorithm could be used for periodic bulk OSM edits that will add the missing tag/s.

What steps will reproduce the bug?

The POI layer does not include a Wikipedia POI for זיתן, and the OSM-based POI for זיתן does not have a Wikipedia link.

Hebrew Wikipedia has a page for זיתן with a location.

The Hebrew Wikipedia API returns that page:

{
  "batchcomplete": "",
  "query": {
    "geosearch": [
      {
        "pageid": 44693,
        "ns": 0,
        "title": "זיתן",
        "lat": 31.9756126464499,
        "lon": 34.8905390342779,
        "dist": 22.9,
        "primary": ""
      }
    ]
  }
}
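For reference, a response like the one above comes from the MediaWiki `list=geosearch` query. The sketch below builds such a request URL and extracts the titles from a response. The search radius and result limit used to produce the response in this report are not stated, so the defaults here are assumptions, and the helper names are hypothetical.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode

API_URL = "https://he.wikipedia.org/w/api.php"


def geosearch_url(lat: float, lon: float, radius_m: int = 1000, limit: int = 10) -> str:
    """Build a geosearch query around a point (radius and limit are assumed values)."""
    params = {
        "action": "query",
        "list": "geosearch",
        "gscoord": f"{lat}|{lon}",  # "lat|lon"; urlencode escapes the pipe
        "gsradius": radius_m,       # search radius in meters
        "gslimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def pages_from_geosearch(response_text: str) -> List[Tuple[str, float, float]]:
    """Extract (title, lat, lon) for each page in a geosearch response."""
    data = json.loads(response_text)
    return [
        (page["title"], page["lat"], page["lon"])
        for page in data.get("query", {}).get("geosearch", [])
    ]
```

Applied to the response above, `pages_from_geosearch` would yield a single entry for זיתן with the coordinates shown.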

There is also an opposite case, where a Wikipedia page that does not have a location is shown as an IHM POI. The Wikipedia page בית הכנסת צלאת בן שאיף does not have a location and is not returned by the Wikipedia API, but it exists as an IHM POI.
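One way to detect such pages is to ask the API for each page's coordinates and flag the pages that have none. The sketch below assumes the response shape of an `action=query&prop=coordinates` request, where each page entry carries a `coordinates` list only if the page has a location. The function name is hypothetical, and this is not how IHM currently builds the layer.

```python
from typing import Any, Dict, List


def titles_without_coordinates(response: Dict[str, Any]) -> List[str]:
    """Return titles of pages in a prop=coordinates response that lack a location."""
    pages = response.get("query", {}).get("pages", {})
    # The default response format keys pages by page id; formatversion=2 uses a list.
    page_list = pages.values() if isinstance(pages, dict) else pages
    return [
        page.get("title", "")
        for page in page_list
        if not page.get("coordinates")  # key missing or empty list: no location
    ]
```

Pages flagged this way could be excluded from the POI layer or reported for review.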

What I expect to happen

Use OSM tags to identify Wikipedia pages

Platform

  • [X] Israel Hiking Map app
  • [X] Israel Hiking Map site in a browser

OS Name and Version

All

Browser Name and Version

All

Additional information

These problems with the contents of the Wikipedia POI layer may also be related to #1971.

zstadler, Feb 10 '24 11:02

Note that many Wikipedia POIs are not exactly at the right coordinates. This is natural for an encyclopedia that is mostly text-based and less concerned with exact geolocation than OSM. When merging, my recommendation is to rely on OSM for the location.

Tsurha, Mar 02 '24 16:03

If the POI exists in OSM and the Wikipedia page is merged into it, the location is taken from OSM, not Wikipedia.

HarelM, Mar 02 '24 19:03

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days

github-actions[bot], Jun 01 '24 00:06