
Implement Wikimedia Commons as an image provider

Open PinguDEV-original opened this issue 1 year ago • 5 comments

Description

I would like iD to support Wikimedia Commons images that have coordinates, shown in their own overlay.

Basically the same overlay as for Mapillary, but for Wikimedia Commons (so you can use it for mapping). Ideally, only show images under a CC0 license (otherwise, I think, using the images might not be allowed).

Screenshots

No response

PinguDEV-original avatar Dec 31 '24 14:12 PinguDEV-original

A first step would be to investigate which API to use to efficiently fetch all georeferenced images from Wikimedia Commons.

This could perhaps be a nice student project (e.g. GSoC)?

tyrasd avatar Jan 07 '25 10:01 tyrasd

A first step would be to investigate which API to use to efficiently fetch all georeferenced images from Wikimedia Commons.

https://commons.wikimedia.org/w/api.php?action=query
&generator=geosearch
&ggsnamespace=6
&ggsprimary=all
&ggsbbox=<toplat>|<leftlon>|<bottomlat>|<rightlon>
&ggslimit=100
&prop=imageinfo|coordinates
&iiprop=url
&iiurlwidth=300
&format=json
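The template above can also be filled in programmatically. This is a minimal sketch, not iD code: the helper name `buildGeosearchUrl` and the `top`/`left`/`bottom`/`right` keys are made up for illustration, while the query parameters are exactly the ones listed above.

```javascript
// Hypothetical helper: fills in the generator=geosearch template shown above.
// The bbox order is top latitude, left longitude, bottom latitude, right longitude.
function buildGeosearchUrl(bbox, limit = 100, thumbWidth = 300) {
  const params = new URLSearchParams({
    action: 'query',
    generator: 'geosearch',
    ggsnamespace: '6',            // namespace 6 = File:
    ggsprimary: 'all',
    ggsbbox: [bbox.top, bbox.left, bbox.bottom, bbox.right].join('|'),
    ggslimit: String(limit),
    prop: 'imageinfo|coordinates',
    iiprop: 'url',
    iiurlwidth: String(thumbWidth),
    format: 'json',
  });
  // URLSearchParams percent-encodes the "|" separators as %7C.
  return 'https://commons.wikimedia.org/w/api.php?' + params.toString();
}

// Same bounding box as the example query in this comment.
const exampleUrl = buildGeosearchUrl({ top: 11.5399, left: 76.0197, bottom: 11.5398, right: 76.0198 });
console.log(exampleUrl);
```

When calling this from a browser on another origin, an anonymous request would probably also need `origin=*` added to the parameters.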

Example query:

https://commons.wikimedia.org/w/api.php?action=query&generator=geosearch&ggsnamespace=6&ggsprimary=all&ggsbbox=11.5399|76.0197|11.5398|76.0198&ggslimit=100&prop=imageinfo|coordinates&iiprop=url&iiurlwidth=300&format=json

Example response:

{
  "continue": {
    "iicontinue": "State_of_the_Map_Kerala_2024_Day_1_(194).jpg|20241119191948",
    "cocontinue": "155366027|709250395",
    "continue": "||"
  },
  "query": {
    "pages": {
      "155365984": {
        "pageid": 155365984,
        "ns": 6,
        "title": "File:State of the Map Kerala 2024 Day 1 (3).jpg",
        "index": -1,
        "coordinates": [
          {
            "lat": 11.539844,
            "lon": 76.019777,
            "primary": "",
            "globe": "earth"
          }
        ]
      },
      "155366027": {
        "pageid": 155366027,
        "ns": 6,
        "title": "File:State of the Map Kerala 2024 Day 1 (22).jpg",
        "index": 9
      },
      [...]
    }
  }
}
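A rough sketch of how a client might consume a response like the one above. The function names `extractPhotos` and `nextParams` are hypothetical. The sample object is a trimmed copy of the response shown here. Note that page 155366027 has no `coordinates` yet, presumably because `cocontinue` says the coordinates data continues from that page.

```javascript
// Flattens a generator=geosearch response into a list of located photos.
// Pages that have no "coordinates" array yet are skipped.
function extractPhotos(response) {
  const pages = (response.query && response.query.pages) || {};
  return Object.values(pages)
    .filter(page => Array.isArray(page.coordinates) && page.coordinates.length > 0)
    .map(page => ({
      pageid: page.pageid,
      title: page.title,
      lat: page.coordinates[0].lat,
      lon: page.coordinates[0].lon,
    }));
}

// Merges the "continue" object into the original parameters for the
// follow-up request; returns null when there is nothing left to fetch.
function nextParams(params, response) {
  return response.continue ? { ...params, ...response.continue } : null;
}

// Trimmed copy of the example response above.
const sampleResponse = {
  continue: { cocontinue: '155366027|709250395', continue: '||' },
  query: {
    pages: {
      155365984: {
        pageid: 155365984,
        ns: 6,
        title: 'File:State of the Map Kerala 2024 Day 1 (3).jpg',
        coordinates: [{ lat: 11.539844, lon: 76.019777, primary: '', globe: 'earth' }],
      },
      155366027: {
        pageid: 155366027,
        ns: 6,
        title: 'File:State of the Map Kerala 2024 Day 1 (22).jpg',
      },
    },
  },
};

console.log(extractPhotos(sampleResponse));
console.log(nextParams({ action: 'query' }, sampleResponse));
```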

The first result:

https://commons.wikimedia.org/wiki/File:State_of_the_Map_Kerala_2024_Day_1_(3).jpg

References:

  • https://www.mediawiki.org/wiki/Extension:GeoData#API
  • https://commons.wikimedia.org/wiki/Commons:Search_by_location

gy-mate avatar Jul 30 '25 17:07 gy-mate

Further research is needed on how to get bearing values. (If possible at all.)

gy-mate avatar Jul 30 '25 17:07 gy-mate

Further research is needed on how to get bearing values. (If possible at all.)

If an image has been properly tagged with its heading, it’ll be available in the file’s structured data. The SotM Kerala photo doesn’t have a heading, but this photo I took does, as a heading (P7787) qualifier on a coordinates of the point of view (P1259) statement. If a geosearch query turns up this file, you can take the pageid 170882320 and access its structured data in JSON format using the URL https://commons.wikimedia.org/wiki/Special:EntityData/M170882320.json. The heading is at the path .entities.M170882320.statements.P1259[0].qualifiers.P7787[0].datavalue.value.amount. If you need to get this information in bulk, you’ll want to call the Wikidata Query Service API instead.
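The path described above could be read defensively like this. It is only a sketch: `headingFromEntityData` is a made-up name, and the mock object is hand-written to mirror that path. Its `"+270"` amount is invented for illustration, on the assumption that quantity amounts arrive as signed strings.

```javascript
// Reads the heading (P7787) qualifier from the first
// "coordinates of the point of view" (P1259) statement, if present.
// Returns a number of degrees, or null when the file has no heading.
function headingFromEntityData(entityData, pageid) {
  const entity = entityData?.entities?.['M' + pageid];
  const amount = entity?.statements?.P1259?.[0]?.qualifiers?.P7787?.[0]?.datavalue?.value?.amount;
  if (amount === undefined) return null;
  const heading = Number(amount); // Number("+270") === 270
  return Number.isFinite(heading) ? heading : null;
}

// Hand-written mock mirroring the path above; the heading value is made up.
const mockEntityData = {
  entities: {
    M170882320: {
      statements: {
        P1259: [
          { qualifiers: { P7787: [{ datavalue: { value: { amount: '+270' } } }] } },
        ],
      },
    },
  },
};

console.log(headingFromEntityData(mockEntityData, 170882320));
console.log(headingFromEntityData({ entities: {} }, 155365984));
```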

Alternatively, if you only want headings from EXIF metadata and don’t mind omitting manually entered headings, add metadata to the iiprop parameter.

1ec5 avatar Sep 11 '25 06:09 1ec5

I think the code of the Wikimedia tool WikiShootMe is the best candidate. Looking at its source code, here is its API:

Example query SPARQL to get pictures in a square:

https://w.wiki/GabZ

Example query SPARQL (in case the link above rots):

#TOOL: WikiShootMe
SELECT ?q ?qLabel ?location ?image ?reason ?desc ?commonscat ?street WHERE {
  SERVICE wikibase:box {
    ?q wdt:P625 ?location.
    bd:serviceParam wikibase:cornerSouthWest "Point(9.18685019016266 45.46306165193207)"^^geo:wktLiteral;
      wikibase:cornerNorthEast "Point(9.197149872779848 45.46546955969599)"^^geo:wktLiteral.
  }
  OPTIONAL { ?q wdt:P18 ?image. }
  OPTIONAL { ?q wdt:P373 ?commonscat. }
  OPTIONAL { ?q wdt:P969 ?street. }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,en,de,fr,es,it,nl".
    ?q schema:description ?desc;
      rdfs:label ?qLabel.
  }
}
LIMIT 3000

Example request to get srx XML file:

https://query.wikidata.org/bigdata/namespace/wdq/sparql?query=%23TOOL%3A%20WikiShootMe%0ASELECT%20%3Fq%20%3FqLabel%20%3Flocation%20%3Fimage%20%3Freason%20%3Fdesc%20%3Fcommonscat%20%3Fstreet%20WHERE%20%7B%20SERVICE%20wikibase%3Abox%20%7B%20%3Fq%20wdt%3AP625%20%3Flocation%20.%20bd%3AserviceParam%20wikibase%3AcornerSouthWest%20%22Point(9.18685019016266%2045.46306165193207)%22%5E%5Egeo%3AwktLiteral%20.%20bd%3AserviceParam%20wikibase%3AcornerNorthEast%20%22Point(9.197149872779848%2045.46546955969599)%22%5E%5Egeo%3AwktLiteral%20%7D%20%20OPTIONAL%20%7B%20%3Fq%20wdt%3AP18%20%3Fimage%20%7D%20%20OPTIONAL%20%7B%20%3Fq%20wdt%3AP373%20%3Fcommonscat%20%7D%20%20OPTIONAL%20%7B%20%3Fq%20wdt%3AP969%20%3Fstreet%20%7D%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2Cen%2Cde%2Cfr%2Ces%2Cit%2Cnl%22%20.%20%3Fq%20schema%3Adescription%20%3Fdesc%20.%20%3Fq%20rdfs%3Alabel%20%3FqLabel%20%7D%20%7D%20LIMIT%203000

Example command to get srx XML file:

curl 'https://query.wikidata.org/bigdata/namespace/wdq/sparql?query=%23TOOL%3A%20WikiShootMe%0ASELECT%20%3Fq%20%3FqLabel%20%3Flocation%20%3Fimage%20%3Freason%20%3Fdesc%20%3Fcommonscat%20%3Fstreet%20WHERE%20%7B%20SERVICE%20wikibase%3Abox%20%7B%20%3Fq%20wdt%3AP625%20%3Flocation%20.%20bd%3AserviceParam%20wikibase%3AcornerSouthWest%20%22Point(9.18685019016266%2045.46306165193207)%22%5E%5Egeo%3AwktLiteral%20.%20bd%3AserviceParam%20wikibase%3AcornerNorthEast%20%22Point(9.197149872779848%2045.46546955969599)%22%5E%5Egeo%3AwktLiteral%20%7D%20%20OPTIONAL%20%7B%20%3Fq%20wdt%3AP18%20%3Fimage%20%7D%20%20OPTIONAL%20%7B%20%3Fq%20wdt%3AP373%20%3Fcommonscat%20%7D%20%20OPTIONAL%20%7B%20%3Fq%20wdt%3AP969%20%3Fstreet%20%7D%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2Cen%2Cde%2Cfr%2Ces%2Cit%2Cnl%22%20.%20%3Fq%20schema%3Adescription%20%3Fdesc%20.%20%3Fq%20rdfs%3Alabel%20%3FqLabel%20%7D%20%7D%20LIMIT%203000' \
  --compressed \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:140.0) Gecko/20100101 Firefox/140.0' \
  -H 'Accept: application/json, text/javascript, */*; q=0.01' \
  -H 'Accept-Language: it,en;q=0.5' \
  -H 'Accept-Encoding: gzip, deflate, br, zstd' \
  -H 'Origin: https://wikishootme.toolforge.org' \
  -H 'DNT: 1' \
  -H 'Sec-GPC: 1' \
  -H 'Connection: keep-alive' \
  -H 'Referer: https://wikishootme.toolforge.org/' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: cross-site' \
  -H 'Priority: u=0' \
  -H 'TE: trailers'

Example output of the srx XML file:

<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
	<head>
		<variable name='q'/>
		<variable name='qLabel'/>
		<variable name='location'/>
		<variable name='image'/>
		<variable name='reason'/>
		<variable name='desc'/>
		<variable name='commonscat'/>
		<variable name='street'/>
	</head>
	<results>
		<result>
			<binding name='q'>
				<uri>http://www.wikidata.org/entity/Q15121</uri>
			</binding>
			<binding name='location'>
				<literal datatype='http://www.opengis.net/ont/geosparql#wktLiteral'>Point(9.190336111 45.464161111)</literal>
			</binding>
			<binding name='image'>
				<uri>http://commons.wikimedia.org/wiki/Special:FilePath/Milano%20-%20palazzo%20Isimbardi%20-%20facciata.jpg</uri>
			</binding>
			<binding name='commonscat'>
				<literal>Province of Milan (Italy)</literal>
			</binding>
			<binding name='desc'>
				<literal xml:lang='en'>province in the Lombardy region, Italy</literal>
			</binding>
			<binding name='qLabel'>
				<literal xml:lang='en'>province of Milan</literal>
			</binding>
		</result>
		<result>
			<binding name='q'>
				<uri>http://www.wikidata.org/entity/Q891865</uri>
			</binding>
			<binding name='location'>
				<literal datatype='http://www.opengis.net/ont/geosparql#wktLiteral'>Point(9.194095 45.463302)</literal>
			</binding>
			<binding name='image'>
				<uri>http://commons.wikimedia.org/wiki/Special:FilePath/Milano%20-%20Piazza%20Fontana%20-%20Banca%20Nazionale%20dell%27Agricoltura.jpg</uri>
			</binding>
			<binding name='commonscat'>
				<literal>Piazza Fontana bombing in Milan</literal>
			</binding>
			<binding name='desc'>
				<literal xml:lang='en'>terrorist attack carried out in Milan in 1969</literal>
			</binding>
			<binding name='qLabel'>
				<literal xml:lang='en'>Piazza Fontana bombing</literal>
			</binding>
		</result>
	</results>
</sparql>
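The `location` bindings above are WKT literals with longitude first, then latitude. A small sketch of turning one into numbers follows. The helper name `parseWktPoint` is hypothetical, and it only handles the simple `Point(lon lat)` form seen in this output.

```javascript
// Parses a WKT literal such as "Point(9.190336111 45.464161111)".
// Note the order: longitude first, then latitude.
function parseWktPoint(wkt) {
  const number = '-?\\d+(?:\\.\\d+)?(?:[eE][-+]?\\d+)?';
  const pattern = new RegExp('^Point\\(\\s*(' + number + ')\\s+(' + number + ')\\s*\\)$', 'i');
  const match = pattern.exec(String(wkt).trim());
  if (!match) return null;
  return { lon: Number(match[1]), lat: Number(match[2]) };
}

// Value taken from the first <result> above.
console.log(parseWktPoint('Point(9.190336111 45.464161111)'));
```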

More info about WikiShootMe:

https://meta.wikimedia.org/wiki/WikiShootMe

valerio-bozzolan avatar Dec 10 '25 11:12 valerio-bozzolan

@valerio-bozzolan I think your query only considers Wikidata items which have an image (P18) property? But we want every photo from Commons, even the ones with no linked Wikidata item.

For anyone following this issue: there's a proposal in #11666, with several API-related questions in the PR description. demo here.

k-yle avatar Dec 15 '25 12:12 k-yle

You're right @k-yle - thanks! I was not capturing the correct HTTP call in my web browser, since the call to Wikimedia Commons is not an Ajax (XHR) call but a JSONP call (!). So a <script> tag is dynamically included every time you move the map (!) WTF...

Anyway:

HTTP call to get Wikimedia Commons images in an area (JavaScript result):

https://commons.wikimedia.org/w/api.php?callback=jQuery31002935841652889848_1765808017220&action=query&list=geosearch&gsbbox=45.476382864438314%7C9.034881591796877%7C45.4468846182856%7C9.199676513671877&gsnamespace=6&gslimit=500&format=json&_=1765808017243

Wikimedia Commons images in an area, in JSON:

https://commons.wikimedia.org/w/api.php?action=query&list=geosearch&gsbbox=45.476382864438314|9.034881591796877|45.4468846182856|9.199676513671877&gsnamespace=6&gslimit=500&format=json

But NOTE: it does not return image URIs.


API documentation:

https://www.mediawiki.org/wiki/API:Geosearch

https://en.wikipedia.org/w/api.php?action=help&modules=query+geosearch

More relevant API documentation:

https://en.wikipedia.org/w/api.php?action=help&modules=query

Note: Generator parameter names must be prefixed with a "g"
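To illustrate that note: a `list=geosearch` parameter set can be turned into its generator form by prefixing each module parameter with "g". A minimal sketch, where the helper name `toGeneratorParams` is made up; the parameter names and values come from the queries in this comment.

```javascript
// Prefixes every list-module parameter name with "g",
// e.g. gsbbox -> ggsbbox, as required when the module is used as a generator.
function toGeneratorParams(listParams) {
  const generatorParams = {};
  for (const [key, value] of Object.entries(listParams)) {
    generatorParams['g' + key] = value;
  }
  return generatorParams;
}

// Parameters of the list=geosearch call shown earlier in this comment.
const geosearchListParams = {
  gsbbox: '45.476382864438314|9.034881591796877|45.4468846182856|9.199676513671877',
  gsnamespace: '6',
  gslimit: '500',
};

// Generator form, combined with prop=imageinfo to also get file URLs.
const generatorQuery = {
  action: 'query',
  generator: 'geosearch',
  prop: 'imageinfo',
  iiprop: 'url',
  format: 'json',
  ...toGeneratorParams(geosearchListParams),
};

console.log(generatorQuery);
```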

https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bimageinfo

So technically it is possible to use geosearch as a "generator" together with imageinfo to get file URIs in a single call:

Proposed API with geosearch and imageinfo

Wikimedia Commons images in an area, in JSON, and with image URIs (🤩)

https://commons.wikimedia.org/w/api.php?action=query&generator=geosearch&prop=imageinfo&iiprop=url&ggsbbox=45.476382864438314|9.034881591796877|45.4468846182856|9.199676513671877&ggsnamespace=6&ggslimit=500&format=json

Example result:

...
"url": "https://upload.wikimedia.org/wikipedia/commons/f/f4/Milano_Statua_di_Verdi.jpg",
"descriptionurl": "https://commons.wikimedia.org/wiki/File:Milano_Statua_di_Verdi.jpg",
...

Example file URL: https://upload.wikimedia.org/wikipedia/commons/f/f4/Milano_Statua_di_Verdi.jpg

Example file description page: https://commons.wikimedia.org/wiki/File:Milano_Statua_di_Verdi.jpg

valerio-bozzolan avatar Dec 15 '25 14:12 valerio-bozzolan

Ah, I see you are using exactly that approach! Well done! Let's move the discussion to your pull request then.

https://github.com/openstreetmap/iD/pull/11666

valerio-bozzolan avatar Dec 15 '25 14:12 valerio-bozzolan

Further research is needed on how to get bearing values. (If possible at all.)

Good question. I've shared it here: https://www.mediawiki.org/wiki/Extension_talk:GeoData#Bearing

Anyway I guess the GeoData extension probably does not support bearing...

So...

If it's necessary to get the bearing, and/or filter by bearing, I guess the "new" Wikimedia Commons SPARQL query service could be a better tool.

https://commons.wikimedia.org/wiki/Commons:SPARQL_query_service

Example image with such structured info (click on the "structured data" tab):

https://commons.wikimedia.org/wiki/File:Lunenburg_-NS-_Lunenburg_Academy_edit.jpg

So, on Wikimedia Commons, these files have the bearing ("heading") expressed in this way:

<?file        | P1259 | ?coordinates>
<?coordinates | P7787 | ?heading>

Generic SPARQL documentation:

https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual#Search_within_box

API to get Wikimedia Commons files with bearing using SPARQL

https://w.wiki/GjSc 🥳

Example JavaScript code, generated by the endpoint:

class SPARQLQueryDispatcher {
	constructor( endpoint ) {
		this.endpoint = endpoint;
	}

	query( sparqlQuery ) {
		const fullUrl = this.endpoint + '?query=' + encodeURIComponent( sparqlQuery );
		const headers = { 'Accept': 'application/sparql-results+json' };

		return fetch( fullUrl, { headers } ).then( body => body.json() );
	}
}

const endpointUrl = 'https://commons-query.wikimedia.org/sparql';
const sparqlQuery = `SELECT ?file ?url ?location ?heading WHERE {
  SERVICE wikibase:box {
    ?file wdt:P1259 ?location.
    bd:serviceParam wikibase:cornerWest "Point(-121.872777777 37.304166666)"^^geo:wktLiteral;
      wikibase:cornerEast "Point(-121.486111111 39.575277777)"^^geo:wktLiteral.
  }
  ?file p:P1259 ?coordProperty.
  ?coordProperty ps:P1259 ?coordStatement.
  OPTIONAL { ?coordProperty pq:P7787 ?heading. }
  ?file schema:contentUrl ?url.
  BIND(IRI(CONCAT("http://commons.wikimedia.org/wiki/Special:FilePath/", wikibase:decodeUri(SUBSTR(STR(?url), 53 )))) AS ?image)
}
LIMIT 100`;

const queryDispatcher = new SPARQLQueryDispatcher( endpointUrl );
queryDispatcher.query( sparqlQuery ).then( console.log );
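Instead of just logging the result, the bindings could be flattened into plain objects. A sketch, assuming the endpoint answers in the standard SPARQL JSON results format requested by the `Accept` header above. The name `bindingsToPhotos` is hypothetical and all values in the mock are made up.

```javascript
// Converts SPARQL JSON results into plain objects for the variables
// selected by the query above (?file ?url ?location ?heading).
// ?heading comes from an OPTIONAL clause, so it may be absent.
function bindingsToPhotos(sparqlJson) {
  return sparqlJson.results.bindings.map(binding => ({
    file: binding.file.value,
    url: binding.url.value,
    location: binding.location.value, // WKT literal, e.g. "Point(lon lat)"
    heading: binding.heading ? Number(binding.heading.value) : null,
  }));
}

// Hand-written mock in the SPARQL JSON results shape; every value is made up.
const mockSparqlJson = {
  head: { vars: ['file', 'url', 'location', 'heading'] },
  results: {
    bindings: [
      {
        file: { type: 'uri', value: 'https://commons.wikimedia.org/entity/M1' },
        url: { type: 'uri', value: 'https://example.org/placeholder-1.jpg' },
        location: { type: 'literal', value: 'Point(-121.6 38.5)' },
        heading: { type: 'literal', value: '90' },
      },
      {
        file: { type: 'uri', value: 'https://commons.wikimedia.org/entity/M2' },
        url: { type: 'uri', value: 'https://example.org/placeholder-2.jpg' },
        location: { type: 'literal', value: 'Point(-121.7 38.6)' },
      },
    ],
  },
};

console.log(bindingsToPhotos(mockSparqlJson));
```

With the dispatcher above, this would be used as `queryDispatcher.query( sparqlQuery ).then( bindingsToPhotos )`.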

So happy hacking, also with the bearing! Bonus point: this also exposes structured data, already mentioned.

valerio-bozzolan avatar Dec 15 '25 18:12 valerio-bozzolan