mc2skos

Add consistency check and inference with skosify

Open nichtich opened this issue 8 years ago • 5 comments

I am about to refactor Skosify to support use as a module. We could integrate some of its functionality into mc2skos, for instance to make sure that links have counterparts (related in both directions, broader/narrower, ...).

nichtich Nov 14 '17 13:11

Cool, I'm already using it as a package in one of my scripts: https://github.com/scriptotek/data_ub_tasks/blob/master/data_ub_tasks/data_ub_tasks.py#L211-L215

Of course one could just pipe data from mc2skos to skosify, but it takes quite a bit of time to serialize and deserialize large RDF files, so I'm open to adding e.g. a skosify consistency check within mc2skos.

danmichaelo Nov 14 '17 15:11

Skosify has many options that aren't needed, so I would not support all of them. The following seem most useful in my opinion (see also https://seco.cs.aalto.fi/publications/2014/suominen-mader-skosquality.pdf):

  • expand topConceptOf <-> hasTopConcept

  • expand narrower <-> broader

  • expand related <-> related

  • check that related and narrower/broader exclude each other (also transitively: an ancestor should not also be related!)

  • check that the same label is not used as both prefLabel and altLabel for the same concept in the same language

  • check that every concept has exactly one prefLabel per language of the concept scheme

  • detect cycles
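To make the expansion and relation checks listed above concrete, here is a minimal sketch in plain Python. It is not Skosify's implementation: triples are simple `(subject, predicate, object)` string tuples rather than rdflib terms, and the helper names `expand`, `ancestors` and `quality_problems` are made up for illustration. The label checks are left out.

```python
# Illustrative only: plain string tuples instead of rdflib terms,
# and none of these helpers are Skosify or mc2skos functions.
INVERSE = {
    "skos:broader": "skos:narrower",
    "skos:narrower": "skos:broader",
    "skos:related": "skos:related",
    "skos:topConceptOf": "skos:hasTopConcept",
    "skos:hasTopConcept": "skos:topConceptOf",
}


def expand(triples):
    """Return the triples plus the missing counterpart of every link."""
    result = set(triples)
    for s, p, o in triples:
        if p in INVERSE:
            result.add((o, INVERSE[p], s))
    return result


def ancestors(triples, concept):
    """Collect everything reachable from `concept` via skos:broader."""
    parents = {}
    for s, p, o in triples:
        if p == "skos:broader":
            parents.setdefault(s, set()).add(o)
    seen, stack = set(), [concept]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def quality_problems(triples):
    """Report hierarchy cycles and related links that clash with the hierarchy."""
    triples = expand(triples)
    problems = set()
    for s, p, o in triples:
        # A concept that is its own ancestor sits on a broader/narrower cycle.
        if p == "skos:broader" and s in ancestors(triples, s):
            problems.add(("cycle", s))
        # related must not connect a concept with one of its ancestors.
        if p == "skos:related" and o in ancestors(triples, s):
            problems.add(("related-to-ancestor", s, o))
    return sorted(problems)


sample = {
    ("a", "skos:broader", "b"),
    ("b", "skos:broader", "c"),
    ("a", "skos:related", "c"),
    ("c", "skos:topConceptOf", "scheme"),
}
print(quality_problems(sample))
```

The sketch recomputes the ancestor set for every triple, which is fine for a demonstration but would need caching for a large classification.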

I'd like to enable these most common checks with one or two options (e.g. --expand and --quality) instead of having to create a config file, so the choice must be opinionated. An additional option (--skosify configfile) could give access to all of Skosify's features.
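As a rough sketch of how such options could be declared with `argparse`: only the three flag names come from this discussion. The help texts, the `build_parser` function and the `skosify.cfg` file name are assumptions, and the real mc2skos argument parser may be organised differently.

```python
import argparse


def build_parser():
    # Hypothetical wiring; only the three flag names come from the discussion.
    parser = argparse.ArgumentParser(prog="mc2skos")
    parser.add_argument("--expand", action="store_true",
                        help="add missing inverse and symmetric links")
    parser.add_argument("--quality", action="store_true",
                        help="run an opinionated set of consistency checks")
    parser.add_argument("--skosify", metavar="CONFIGFILE",
                        help="process the data with Skosify using this config file")
    return parser


# "skosify.cfg" is a placeholder file name.
args = build_parser().parse_args(["--expand", "--skosify", "skosify.cfg"])
print(args.expand, args.quality, args.skosify)  # True False skosify.cfg
```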

nichtich Nov 15 '17 08:11

👍 for the two options!

When it comes to supporting a config file, it would be good if the same format could also be used by skosify directly.

danmichaelo Nov 15 '17 16:11

First part implemented in #45. Packaging Python and its dependencies drives me nuts, but I managed to do it.

P.S. I also added the --skosify option.

nichtich Nov 21 '17 11:11

I'm not going to implement the --quality option soon, because it requires https://github.com/NatLibFi/Skosify/issues/52 and can also be done to some degree with the --skosify option. You can close this issue after the merge.

nichtich Nov 22 '17 10:11