
Refactoring relation and fact schema

Open lintool opened this issue 6 years ago • 3 comments

Currently, our schema for relations and facts looks something like this:

[Screenshot: current schema for relations and facts]

There's an asymmetry here: relations are reified with an explicit relation node, while facts are not. We should refactor to make the schema more consistent.

(Also, to me, the object_of relation has its directionality reversed.)

lintool, Sep 27 '19 09:09

In introducing an intermediate relation node for the ground-truth, do we want the type to be CITY_OF_HEADQUARTERS (CoreNLP) or P159 (Wikidata)?

An argument for CITY_OF_HEADQUARTERS is that the queries are cleaner, as we can match on nodes of the same type. An argument against is that we lose the Wikidata property information (does this even matter?).

An argument for P159 is that we keep the Wikidata property and can map back to see where it came from, but the queries are messier because we need to know, and include, the mapping between CoreNLP and Wikidata.
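To make the trade-off concrete, here is a minimal Python sketch (not code from the project). The dictionary `CORENLP_TO_WIKIDATA` and both function names are hypothetical; the only mapping taken from this thread is CITY_OF_HEADQUARTERS to P159.

```python
# Hypothetical lookup table; only this one pair appears in the discussion.
CORENLP_TO_WIKIDATA = {"CITY_OF_HEADQUARTERS": "P159"}


def same_type(relation_type: str, fact_type: str) -> bool:
    """Option 1: facts are stored under the CoreNLP name, so types compare directly."""
    return relation_type == fact_type


def same_type_via_mapping(relation_type: str, fact_property: str) -> bool:
    """Option 2: facts keep the Wikidata property, so every comparison consults the mapping."""
    return CORENLP_TO_WIKIDATA.get(relation_type) == fact_property
```

With option 1 the comparison is a plain equality check; with option 2 every query has to carry the mapping, but the Wikidata property survives on the node.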

ryan-clancy, Sep 27 '19 12:09

I'm leaning toward P159. This leaves open the possibility that a relation might not align perfectly with a fact property, so we can't do this mapping up front.

So, just to be concrete, the tweak we are suggesting is to take the fact (Q355 "Facebook", hq, Q74195 "Menlo Park") and create:

  • (Q355, subject-of, FACT[type:hq])
  • (Q74195, object-of, FACT[type:hq])

This also allows the mention "Menlo Park" in text to be linked to Q74195.
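As a rough illustration of the reification described above, the following Python sketch turns one fact into the two triples. The `Triple` type and `reify_fact` function are hypothetical names, and the fact node is represented by a plain label string; a real graph would need a unique node identity per fact.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    head: str
    edge: str
    tail: str


def reify_fact(subject: str, fact_type: str, obj: str) -> List[Triple]:
    """Replace a direct (subject, type, object) edge with an intermediate FACT node."""
    # Assumption: the label string stands in for a real, uniquely identified node.
    fact_node = f"FACT[type:{fact_type}]"
    return [
        Triple(subject, "subject-of", fact_node),
        Triple(obj, "object-of", fact_node),
    ]


triples = reify_fact("Q355", "hq", "Q74195")
for t in triples:
    print(t)
```

Because Q74195 remains its own node, a mention in text can link to it independently of any single fact.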

lintool, Sep 27 '19 13:09

Furthermore, I would change these to has-subject and has-object, to make it obvious that the FACT or RELATION node comes first in the triple.
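A small Python sketch of what that renaming implies, using the Facebook example from earlier in the thread. The function name `reify_fact_node_first` is hypothetical, and the fact node is again just a label string standing in for a real node.

```python
from typing import List, Tuple


def reify_fact_node_first(subject: str, fact_type: str, obj: str) -> List[Tuple[str, str, str]]:
    """Emit triples whose first element is always the FACT node."""
    # Assumption: a label string stands in for a uniquely identified node.
    fact_node = f"FACT[type:{fact_type}]"
    return [
        (fact_node, "has-subject", subject),
        (fact_node, "has-object", obj),
    ]


node_first = reify_fact_node_first("Q355", "hq", "Q74195")
for triple in node_first:
    print(triple)
```

Both edges now point away from the intermediate node, so the same pattern would apply unchanged to a RELATION node.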

lintool, Sep 27 '19 13:09