Make our SPARQL queries compatible with non-Blazegraph endpoints
Description
The Wikidata endpoint is struggling more and more, and even with our shared SPARQL cache, builds are not unlikely to be interrupted by timeouts. If the status quo persists, we will need to move off WDQS at some point.
In the meantime, I think we should aim to move our SPARQL queries away from WDQS/Blazegraph-specific SERVICEs and syntax. For us, I think this mainly means replacing the wikibase:label service. Such a move would not only make it easier to switch to another endpoint one day, it would also make it easier to experiment with WDQS alternatives.
Should we also make sure we always have the right prefixes added?
No, I don't think that should be a priority, as most SPARQL backends support predefined prefixes.
It turns out that the WDQS label service is very slow and that QLever, for example, can beat it with a bunch of OPTIONAL/FILTER clauses in combination with COALESCE.
The result will look rather verbose, but it's better than nothing.
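Since the OPTIONAL/FILTER/COALESCE pattern is so repetitive, one way to keep the verbosity manageable might be to generate it. The following is only a minimal sketch in Python: the helper name `label_fallback` and its parameters are hypothetical and not part of our tooling, and it assumes that the languages are passed in order of preference.

```python
def label_fallback(subject, predicate, out_var, languages):
    """Build one OPTIONAL/FILTER block per language plus a COALESCE bind.

    Hypothetical helper, for illustration only. `languages` is ordered by
    preference: the first language that has a value ends up in ?out_var.
    """
    lines = []
    candidates = []
    for lang in languages:
        # Derive a per-language variable, e.g. orgLabel + "en" -> ?orgLabelEn.
        # Hyphens are not allowed in SPARQL variable names, so replace them.
        var = f"?{out_var}{lang.replace('-', '_').capitalize()}"
        candidates.append(var)
        lines.append("OPTIONAL {")
        lines.append(f"  ?{subject} {predicate} {var} .")
        lines.append(f'  FILTER(LANG({var}) = "{lang}")')
        lines.append("}")
    # COALESCE returns the first bound candidate, in the given order.
    lines.append(f"BIND(COALESCE({', '.join(candidates)}) AS ?{out_var})")
    return "\n".join(lines)


# Example: labels for ?org, preferring English, then "mul", then Ukrainian.
print(label_fallback("org", "rdfs:label", "orgLabel", ["en", "mul", "uk"]))
```

The same call with `schema:description` as the predicate would produce the description block, so a query would need one call per output variable.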
The holy grail might be to push the topic forward in the SPARQL 1.2 discussion; I may know someone who would be interested in collaborating on that.
Here is an example of what a query could look like if written in standard SPARQL:
# expected_result_count: 47
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT
?qid
?orgLabel
?orgDescription
?type
?typeLabel
?country
WHERE {
BIND(wd:Q212 AS ?country)
VALUES ?type {
wd:Q7878545 # ministries (20)
wd:Q3348196 # oblast (24)
wd:Q84823091 # region (1)
wd:Q5124045 # city with special status (2)
}
?org wdt:P31 ?type .
?org wdt:P17 ?country .
MINUS { ?org wdt:P576 [] }
MINUS { ?org wdt:P1366 [] }
BIND(REPLACE(STR(?org), "http://www.wikidata.org/entity/", "") AS ?qid)
OPTIONAL {
?org rdfs:label ?orgLabelMul .
FILTER(LANG(?orgLabelMul) = "mul")
}
OPTIONAL {
?org rdfs:label ?orgLabelEn .
FILTER(LANG(?orgLabelEn) = "en")
}
OPTIONAL {
?org rdfs:label ?orgLabelUk .
FILTER(LANG(?orgLabelUk) = "uk")
}
BIND(COALESCE(?orgLabelEn, ?orgLabelMul, ?orgLabelUk) AS ?orgLabel)
OPTIONAL {
?org schema:description ?orgDescriptionMul .
FILTER(LANG(?orgDescriptionMul) = "mul")
}
OPTIONAL {
?org schema:description ?orgDescriptionEn .
FILTER(LANG(?orgDescriptionEn) = "en")
}
OPTIONAL {
?org schema:description ?orgDescriptionUk .
FILTER(LANG(?orgDescriptionUk) = "uk")
}
BIND(COALESCE(?orgDescriptionEn, ?orgDescriptionMul, ?orgDescriptionUk) AS ?orgDescription)
OPTIONAL {
?type rdfs:label ?typeLabelMul .
FILTER(LANG(?typeLabelMul) = "mul")
}
OPTIONAL {
?type rdfs:label ?typeLabelEn .
FILTER(LANG(?typeLabelEn) = "en")
}
OPTIONAL {
?type rdfs:label ?typeLabelUk .
FILTER(LANG(?typeLabelUk) = "uk")
}
BIND(COALESCE(?typeLabelEn, ?typeLabelMul, ?typeLabelUk) AS ?typeLabel)
}
ORDER BY ?type ?orgLabel