"Conjugation" of Wikidata properties

Open lvaudor opened this issue 2 years ago • 1 comments

Hi,

I'm not sure this is an "Issue" as such, but I would like your input on it, @maelle.

I'm working on the sequins package and, while doing so, I'm trying to get the labels of properties. I have written a function get_label (different from the one we formerly implemented in glitter, because I'm trying to use glitter itself rather than WikidataR).

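library(glitter)
library(dplyr)  # provides %>%

# Return the label(s) of a prefixed term (e.g. "wd:P31") in the requested language.
get_label <- function(string, language = "en", endpoint = "Wikidata", labelling_prop = "rdfs:label") {
  # Strings that are not prefixed terms are returned unchanged
  if (!glitter:::is_prefixed(string)) {
    return(string)
  }
  spq_init(endpoint = endpoint) %>%
    spq_add(glue::glue("{string} {labelling_prop} ?string_label")) %>%
    spq_mutate(languages = lang(string_label)) %>%
    spq_perform() %>%
    dplyr::filter(languages == language) %>%  # keep only labels in the requested language
    dplyr::pull(string_label)
}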

I want to pick property names directly from the triple patterns of the glitter query, so I will have, for instance, "wdt:P31" to label. The thing is, "wdt:P31" does not have a label, whereas "wd:P31" does. This made me fully realize that Wikidata has this unique (I think?) feature of (kind of) conjugating its properties based on their location or role in the triple patterns. For instance: "wd:" when the property is used as a subject or object, "wdt:" when it is used as a verb, and "p:", "ps:", "pq:" for statements and their qualifiers.

Would you agree with that way of seeing things? Have you encountered this kind of "conjugation" in another SPARQL endpoint?

I'm on my way to replacing "wdt:", "p:", "ps:" and "pq:" with "wd:" in the get_label() function above, but I'd love to hear your thoughts on this ;-)

lvaudor, Oct 09 '23 09:10

wow, more grammar!

Yes, I think it makes sense that you're getting to the root of the property. This has a name in language processing, stemming (https://en.wikipedia.org/wiki/Stemming), so you could call the internal function doing this replacement stem_property.
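As a rough illustration of the replacement discussed in this thread, a minimal base-R sketch of such a helper might look like the following. The default prefix list (only the prefixes named above) and the restriction to identifiers of the form P followed by digits are assumptions made for the example. This is not glitter's actual implementation.

```r
# Hypothetical helper: map a "conjugated" Wikidata property back to its "wd:" form.
# Assumption: only the prefixes mentioned in this thread are handled by default.
stem_property <- function(string, prefixes = c("wdt", "p", "ps", "pq")) {
  # Build a pattern such as "^(wdt|p|ps|pq):(P[0-9]+)$"
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), "):(P[0-9]+)$")
  # Keep the property identifier (second group) and swap the prefix for "wd:"
  sub(pattern, "wd:\\2", string)
}

stem_property(c("wdt:P31", "p:P31", "pq:P580", "wd:Q5", "rdfs:label"))
# "wd:P31" "wd:P31" "wd:P580" "wd:Q5" "rdfs:label"
```

Strings that do not match the pattern (items such as "wd:Q5", or terms from other vocabularies such as "rdfs:label") are returned unchanged, so the result could be passed straight to get_label().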

maelle, Oct 12 '23 09:10