a function to get labels based on ID
Hi,
For sequins I have been working on a get_label() function:
#' This function takes a component of a triple pattern as input and returns (if it exists) a corresponding human-readable label.
#' @param string the string (a part of a triple pattern) to label
#' @param language the language in which to return the label (defaults to "en")
#' @param endpoint the SPARQL endpoint that is being queried (defaults to "wikidata")
#' @param label_property the name of the labelling property, for instance "skos:prefLabel". Defaults to "rdfs:label". If the endpoint is one of the usual glitter endpoints (see glitter::usual_endpoints) the labelling property is set accordingly.
#' @return the label corresponding to the string
#' @export
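get_label <- function(string, language = "en", endpoint = "wikidata", label_property = "rdfs:label") {
  # For the usual glitter endpoints, use the labelling property registered for that endpoint
  if (endpoint %in% glitter::usual_endpoints$name) {
    index_endpoint <- which(glitter::usual_endpoints$name == endpoint)
    label_property <- glitter::usual_endpoints$label_property[index_endpoint]
  }
  # Variables and literals are not prefixed: return them unchanged
  if (!glitter:::is_prefixed(string)) {
    return(string)
  }
  # Labels are attached to the wd: entity, so map the property prefixes onto wd:.
  # Base sub() is used because str_replace comes from stringr, so
  # glitter:::str_replace may not resolve.
  entity <- sub("^(wdt|p|ps|pq):", "wd:", string)
  result <- glitter::spq_init(endpoint = endpoint) %>%
    glitter::spq_add(glue::glue("{entity} {label_property} ?string_label")) %>%
    glitter::spq_mutate(languages = lang(string_label)) %>%
    glitter::spq_perform() %>%
    dplyr::filter(languages == language) %>% # because I don't know how to make glitter::spq_filter work here
    dplyr::pull(string_label)
  # No label found: fall back to the original input
  if (length(result) == 0) {
    return(string)
  }
  return(result)
}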
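The two steps that do not need a network call (rewriting the property prefixes and falling back to the input) can be checked offline. Here is a minimal base R sketch; the helper names `normalize_wikidata_prefix()` and `label_or_input()` are hypothetical and exist in neither glitter nor sequins:

```r
# Hypothetical helper (not part of glitter or sequins): map the Wikidata
# property prefixes wdt:, p:, ps: and pq: onto the entity prefix wd:.
normalize_wikidata_prefix <- function(string) {
  sub("^(wdt|p|ps|pq):", "wd:", string)
}

# Hypothetical helper: return the labels found, or the input if there are none.
label_or_input <- function(labels, string) {
  if (length(labels) == 0) string else labels
}

normalize_wikidata_prefix("wdt:P31")          # "wd:P31"
normalize_wikidata_prefix("wd:Q152088")       # unchanged
label_or_input(character(0), "hal:structure") # "hal:structure"
```

Strings without one of these prefixes pass through untouched, so the rewrite should be harmless on endpoints other than Wikidata.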
It's supposed to work on all endpoints but I'll admit that right now my only examples which make much sense are on Wikidata...
Examples:
get_label("wd:Q152088",language="en") # returns "French fries"
get_label("wd:Q152088",language="fr") # returns "frite"
get_label("wdt:P31", language="fr") # returns "nature de l'élément"
get_label("'David Bowie'") # returns "'David Bowie'"
get_label("?item") # returns "?item"
get_label("hal:structure",endpoint="hal") # returns 'hal:structure'
I'm wondering whether it should be included in glitter rather than sequins? What do you think?
It's supposed to work on all endpoints but I'll admit that right now my only examples which make much sense are on Wikidata...
Because other endpoints have readable properties?
Well, it would make sense if they did, but I think they generally don't :-(. Maybe DBpedia could gather data about OWL vocabularies? I haven't had the time to check, though.
In that sense (if it's only relevant for Wikidata), it's similar to some functions you just removed. On the other hand, not that similar, because at least it doesn't rely on external packages.
Could it live in a third package?
- glitter for query building
- sequins for query visualization
with Wikidata util?