Reduce the jargon

Open dbooth-boston opened this issue 5 years ago • 37 comments

From https://lists.w3.org/Archives/Public/semantic-web/2019Jan/0002.html

It might be easier for complete newbies if plainer language was used:

  • Resource -> Thing
  • Predicate/property -> Relation

Then a statement would be:

Thing–Relation–Thing

I understand the heritage behind the current naming, but for a newbie the first hurdle is understanding that "resource" has a different meaning to the one in the dictionary, that it actually means "thing". The dictionary definitions of "predicate" and "property" also don't correspond to the center position of an RDF triple in my opinion, whereas the dictionary definition of "relation" does.
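
To make the proposed vocabulary concrete, here is a minimal sketch in Python of a statement as a three-part record. The `Statement` type, its field names, the `as_rdf_terms` helper and the `example.org` IRIs are all illustrative assumptions, not part of any RDF library. The helper only shows that the plainer names map one-to-one onto subject, predicate and object.

```python
from typing import Dict, NamedTuple


class Statement(NamedTuple):
    """A Thing-Relation-Thing statement (hypothetical newcomer-friendly naming)."""
    thing: str      # what RDF calls the subject (a resource)
    relation: str   # what RDF calls the predicate (a property)
    other: str      # what RDF calls the object (a resource or a literal)


def as_rdf_terms(statement: Statement) -> Dict[str, str]:
    """Relabel the same three positions with the standard RDF names."""
    return {
        "subject": statement.thing,
        "predicate": statement.relation,
        "object": statement.other,
    }


# Example IRIs are made up for illustration.
alice_knows_bob = Statement(
    "http://example.org/alice",
    "http://example.org/knows",
    "http://example.org/bob",
)
print(as_rdf_terms(alice_knows_bob))
```

Only the labels change between the two views; the data stays the same.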

Consequences would be RDF becomes TDF, or simply DF, to avoid redundancy. URI unfortunately becomes UTI, though it could be shortened in a similar way to simply UI.

I know it likely won't be a popular idea, but if you're looking for the perspective of relative newbies that's one about jargon I can share.

dbooth-boston avatar Mar 19 '19 22:03 dbooth-boston

Funny, but no, please do not do that.

akuckartz avatar Mar 20 '19 00:03 akuckartz

If for the moment we forget renaming RDF and URI, you don’t find that

Thing–Relation–Thing

is simpler than

Resource–Predicate–Resource ?

Very curious what people think as this was my proposal.

anthonymoretti avatar Mar 20 '19 02:03 anthonymoretti

Please don't. It would be good to have a non-technical intro that uses graph terminology to introduce RDF terminology (and when we write introductions in text books, this is exactly what we do), and we should have one, but please don't touch the technical terminology (ever). "RDF resource" sounds somewhat misleading, though, and my non-technical term (before introducing "RDF resource") is "node", but this is actually an incorrect oversimplification, because properties can be subjects and objects of RDF statements. "Thing" would have similar connotations (from OWL), and then you end up with statements like "owl:Nothing is a Thing". So there is a good reason for keeping framework-specific terminology. IMHO, the only feasible work-around to avoid confusion with, say, (language) resource or (grammatical) subject, etc., is to systematically use the terms "RDF resource" and "RDF subject".

chiarcos avatar Mar 20 '19 03:03 chiarcos

Because of punning anything can be considered an owl:Thing though, right? So because owl:Nothing is an owl:Class, and classes can be things due to punning, can't owl:Nothing also be an owl:Thing?

I'm no expert by any means, but I always thought owl:Thing was equivalent to rdfs:Resource. So any illogical sounding statements involving "Thing" would already have equivalent illogical sounding statements involving "Resource" it seems.

anthonymoretti avatar Mar 20 '19 08:03 anthonymoretti

The terminology will also depend on the level of the representation - if we're directly supporting n-ary relationships, something I think we should, then we may want to consider using terms relating to objects, properties and relationships. This would also lead to easier to learn serializations for data and rules. My understanding is that to reach the middle 33% of developers, we need something different from the existing RDF framework, albeit something formally built on top of that.

draggett avatar Mar 20 '19 09:03 draggett

I note that the Property Graph folks don't seem to have an agreed term for graph nodes with properties, although some have considered "entity". In Cognitive Psychology, the agreed term is "chunk". Minsky used the term "frame". I reckon that "object" or "thing" are good candidates.

draggett avatar Mar 20 '19 09:03 draggett

Do you mind expanding the point about the effect n-ary relationships might have on the terms?

I also find "object" familiar, but probably because of my experience with OO I think. "Entity" and "chunk" seem like jargon, whereas "thing" is used in everyday speech - something, nothing, anything, everything. Even taking a look at https://www.w3.org/TR/rdf11-concepts/#resources-and-statements, that section of text refers to things and relations.

anthonymoretti avatar Mar 20 '19 10:03 anthonymoretti

Do you mind expanding the point about the effect n-ary relationships might have on the terms?

At the RDF core, we only have triples with subject, predicate and object. In many cases, people think at a higher level, e.g. things with properties, and relationships between things, and it is an unnecessary burden to have to mentally map this into RDF triples.

Some further observations: In principle, relationships between things can also be seen as thing valued properties of other things. Metadata on the relationship can then be considered as sub-properties. However, I think that people like to distinguish relationships and properties, so it is helpful to have a way to do so. A relationship with metadata annotations can be modelled as a thing with the annotations as properties. I am interested in how to support these distinctions without confusing newcomers.
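
As a rough illustration of the last point, the sketch below models a relationship with metadata as a thing in its own right, using plain Python tuples for triples. All names (the `ex:` prefix, `ex:Employment`, `ex:employee`, `properties_of`, and so on) are made up for the example. This is one common modelling pattern, not necessarily the one intended above.

```python
from typing import Dict, List, Tuple

# Each triple is (subject, predicate, object); the "ex:" names are hypothetical.
Triple = Tuple[str, str, str]

# A direct relationship between two things leaves no place for metadata.
direct: List[Triple] = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
]

# The same relationship as a thing of its own, with annotations as properties.
as_thing: List[Triple] = [
    ("ex:employment1", "rdf:type", "ex:Employment"),
    ("ex:employment1", "ex:employee", "ex:alice"),
    ("ex:employment1", "ex:employer", "ex:acme"),
    ("ex:employment1", "ex:since", "2019-03-20"),
]


def properties_of(thing: str, triples: List[Triple]) -> Dict[str, str]:
    """Collect the properties of one thing into a dictionary."""
    return {p: o for s, p, o in triples if s == thing}


print(properties_of("ex:employment1", as_thing))
```

The second form costs three extra triples, which is the mental mapping burden described above.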

A further observation is the connection to the Web of Things, where things are digital twins for sensors and actuators, and exposed to applications as software objects with properties, actions and events. Things have RDF identifiers as a basis for describing the kinds of things, their relationship to the context in which they reside, and the object model with which they are exposed to applications. It would be great to have a common terminology, e.g. things, properties and relationships, to which the web of things adds actions and events.

I think all this points towards a higher level serialization for data, models and rules, that is formally layered on top of the RDF core.

draggett avatar Mar 20 '19 17:03 draggett

I'm no expert by any means, but I always thought owl:Thing was
equivalent to rdfs:Resource. So any illogical sounding statements
involving "Thing" would already have equivalent illogical sounding
statements involving "Resource" it seems.

Yes, but having the term "thing" at the level of RDF makes the apparent
contradiction it conveys much more prominent than only having it within
OWL. I barely ever used owl:Thing in data modeling (in inferences, of
course), but an "RDF thing" would be rather ubiquitous.

That said, I would not object to such a term in a meta language or
higher-level data model that builds on top of RDF, as long as it is
properly distinguished from RDF itself.

chiarcos avatar Mar 20 '19 18:03 chiarcos

So something similar, Dave, to the distinction in OWL between object properties and data properties? Using the terms I proposed you’d have this hierarchy:

Relations
    Thing relations
    Data relations

On “resources”, if “nothing is a thing” appears contradictory then “nothing is a resource” appears equally contradictory, just in my view of course.

anthonymoretti avatar Mar 20 '19 20:03 anthonymoretti

So something similar, Dave, to the distinction in OWL between object
properties and data properties? Using the terms I proposed you’d have
this hierarchy:

Relations
    Thing relations
    Data relations

On “resources”, if “nothing is a thing” appears contradictory then
“nothing is a resource” appears equally contradictory, just in my view
of course.

Strictly speaking, neither "nothing is a thing" nor "nothing is a
resource" is contradictory, just counterintuitive. But I did not
recommend saying "nothing is a resource", but rather "owl:Nothing is an
RDF resource". Not because the term is particularly good, but because it's
unambiguous.

BTW: Of course, data is not a thing, right?

chiarcos avatar Mar 20 '19 22:03 chiarcos

Yeah. Touching the current formal terminology is not a good Thing, and shouldn't be necessary. But conventions and agreements are good. And it is not unusual to have common synonyms used for practical realisations of formal treatments.

And I do find there are problems. Eg.: "Resource" has no intuition (for mortals). I would much prefer Thing (even if it is in OWL (too?)) - I don't like things like Object, because they are too concrete for me. But I see the problem that Resource is well-embedded, so I think we live with that.

However. What exactly should I call Relations when I talk to people? They may even have already seen people talk about Properties, Predicates, Edges, Arcs and a bucket-load of other things, I suspect. Anyway, aren't Properties Relations? Oh, no, Properties describe Relations - so that's very clear then, especially when apparently if I use it as such, it is a Predicate. Properties is wrong - it makes things look uni-directional. Predicates needs some logic background, or it makes no sense. Edges, Arcs, Orcs, etc. require a graph background. Relation has a shedload of good intuition.

Can't we (please) just use Relation, and agree to do so?

I have a bunch of other trivial-seeming things like this that I think require no actual work, and I think would help, that I will get around to posting soon, I hope.

HughGlaser avatar Mar 22 '19 10:03 HughGlaser

“Data is not a thing, right?” I don’t really know if this is an answer, but rdfs:Literal is a subclass of rdfs:Resource, so maybe the answer is yes?

I agree with Hugh’s points about relations, the term is self explanatory.

Is it really too much though to add rdfs:Thing, and maybe very slowly, e.g. 5-10 years, deprecate rdfs:Resource? We have a description framework and the most central concept is poorly described. After working with RDF for a long time of course you get used to it, but the discussion is about getting adoption, and “Resource” is immediate and pervasive jargon for newcomers.

anthonymoretti avatar Mar 23 '19 07:03 anthonymoretti

“Data is not a thing, right?” (...) maybe the answer is yes?

I think so, too. But formulating an opposition between "Data relations"
and "Thing relations" implies that it is not.

chiarcos avatar Mar 23 '19 07:03 chiarcos

Yep, totally agree with you. I was showing what a direct mapping of OWL terms to these terms would look like.

If you took a logical approach to naming the initial model might be:

Relations
    Data relations

Then if you wanted to create a mutually exclusive and collectively exhaustive set of classes, like in OWL, you can create a complementary class:

Relations
    Data relations
    Other relations

Then you give the complement the name of the parent class, because they are essentially plain Relations, and make the parent class abstract, which leaves two concrete classes and no mention of “Thing”:

Relations
Data relations

anthonymoretti avatar Mar 23 '19 08:03 anthonymoretti

"Entity" was mentioned before. Any reason not to use established ER terminology (https://en.wikipedia.org/wiki/Entity–relationship_model)?

Then, we have the following terms:

  • RDF Resource =: "Entity" (rdfs:Class =: "Entity type")
  • rdf:Property [missing]
  • (instance of) owl:ObjectProperty =: "Relationship"
  • (instance of) owl:DatatypeProperty =: "Attribute"

If we just keep "Property" in addition to ER terms, all is covered and we don't need to reinvent anything. Wrt. "Relationship", I would prefer to stay with "Relation", though.

In order to both establish a more consistent view and to keep that apart from the existing RDF ecosystem (which I would not touch), we can create a "lod:" namespace (or so) and define RDF, RDFS and OWL concepts as subclasses (or aliases) of these concepts. One advantage would be that this will remain fully backward-compatible, but if terminology is really that much of a problem, people will eventually move to the new namespace so that we can deprecate RDF and RDFS namespaces in something like, say, 10 years without ever breaking backward-compatibility.
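
A rough sketch of what such an alias namespace might contain, written as plain Python data. The `lod:` term names and the `alias_triples` helper are hypothetical (the comment above only suggests a "lod:" namespace or similar), and `owl:equivalentClass` is just one possible way to express an alias; `rdfs:subClassOf` would be the alternative mentioned above.

```python
from typing import Dict, List, Tuple

# Hypothetical plain-language terms mapped to the existing classes they alias.
ALIASES: Dict[str, str] = {
    "lod:Entity": "rdfs:Resource",
    "lod:EntityType": "rdfs:Class",
    "lod:Property": "rdf:Property",
    "lod:Relation": "owl:ObjectProperty",
    "lod:Attribute": "owl:DatatypeProperty",
}


def alias_triples(aliases: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Emit one (subject, predicate, object) triple per alias."""
    return [
        (existing, "owl:equivalentClass", new_term)
        for new_term, existing in aliases.items()
    ]


for triple in alias_triples(ALIASES):
    print(" ".join(triple), ".")
```

Because the existing terms are only linked to the new ones, data written with either vocabulary stays valid, which is the backward-compatibility argument made above.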

Cf. #52

chiarcos avatar Mar 23 '19 09:03 chiarcos

Just my view but the ER model is still jargon.

Entity - Do people use “entity” or “thing” in everyday speech? I think it’s clear, but you can also check Google Ngram Viewer.

Relationship - I agree that “relation” is preferable.

Attribute - I think it’s important to use the definitions of words as found in the dictionary, and the definition of “attribute” in the dictionary doesn’t describe an OWL Datatype Property.

And yeah a new namespace would be great, even if only because the “r” in existing namespaces isn’t applicable anymore if a different term to “resource” is used.

anthonymoretti avatar Mar 23 '19 10:03 anthonymoretti

I think for correctness it might actually be “data item relation”, rather than “data relation”.

In full, what OWL is really describing are these:

Thing-to-thing relations
Thing-to-data-item relations

So, adding brevity and then following the same naming logic as before, you end up with these two concrete classes of relations:

Relations
Data item relations
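
The distinction can be sketched in a few lines of Python. The `Literal` class and the `classify` function are illustrative names, and the example IRIs are made up. The only assumption carried over from OWL is that a relation whose object is a literal corresponds to a datatype property, and one whose object is a thing corresponds to an object property.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    """A data item such as a string, number or date (hypothetical helper)."""
    value: str


# In this sketch a thing is identified by an IRI string.
Thing = str


def classify(obj: Union[Thing, Literal]) -> str:
    """Name the kind of relation from the kind of object it points to."""
    return "data item relation" if isinstance(obj, Literal) else "relation"


statements = [
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    ("http://example.org/alice", "http://example.org/name", Literal("Alice")),
]
for _, relation, obj in statements:
    print(relation, "->", classify(obj))
```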

anthonymoretti avatar Mar 23 '19 10:03 anthonymoretti

Just my view but the ER model is still jargon.

It's definitely something you can find text books and tooling for. If it's
jargon, it's well-documented, at least, and it was in widespread use
even before RDF emerged. Even nowadays, it isn't dead, but continues in
UML object diagrams.

Entity - Do people use “entity” or “thing” in everyday speech?

If people use "thing" in everyday speech, they normally mean "there is
something I cannot give a more specific name right now". Calling anything
"thing" doesn't mean it's well defined (in natural language), it actually
means the opposite. This is why we have "something", "anything" and
"nothing", cf. https://www.merriam-webster.com/dictionary/thing: "an
object or entity not precisely designated or capable of being designated".

"Entity" isn't used as often, but if so then often as a technical term
with a clear definition as "abstract concept". This is Merriam-Webster's
sense 2: "something that has separate and distinct existence and objective
or conceptual reality"
(https://www.merriam-webster.com/dictionary/entity). I would call this a
match.

Relationship - I agree that “relation” is preferable.

+1

Attribute - I think it’s important to use the definitions of words as
found in the dictionary, and the definition of “attribute” in the
dictionary doesn’t describe an OWL Datatype Property.

A very established term in computation is "attribute-value pair". This is
where the term comes from. And this corresponds exactly to
Merriam-Webster's sense 1 (and 3, for boolean values):
https://www.merriam-webster.com/dictionary/attribute.

Out of curiosity: How do you define "jargon"? If it's "the technical
terminology or characteristic idiom of a special activity or group", then
we should not reduce it, but rather make sure our jargon comes close to
the one of a group that is significantly larger than the RDF community
and includes potential users of the technology.

ER would be a candidate (with applications in RDB and software
engineering), others would be graph terminology (with application in NoSQL
DBs), or UML (object diagrams, with applications in OO programming). In
any case, they should not be mixed. And we should definitely not make
up yet another new terminology.

chiarcos avatar Mar 23 '19 10:03 chiarcos

Whatever you come up with here, it's not gonna stick. Nor should it. Why don't you solve some real problems?

namedgraph avatar Mar 23 '19 11:03 namedgraph

First, in response to the topic, hell no.

There's already a term called rdf:resource, and if that isn't a good enough reason, it's called (R)esource because it's the same R as in U(R)L, U(R)I, and I(R)I. It's a web resource. If you want to make things easier for people new to the technology, throwing everything in the trash and replacing it with vague terms no one agrees on is the exact opposite of what you should be doing. A better approach would be to point out the historic context and use simplified analogies with the caveat that they are just that, simplifications. Historic baggage is everywhere: there's a reason I can still spell color as colour and that there are accents on résumé.

This repository is titled EasierRDF and people are coming across this because they were confused and did a Google search for "can someone make rdf easier". What are they going to think when they see it? "Wow, a decade later and they're still arguing some technically correct but ultimately pointless details of what terms mean. I'm just going to stick with Elasticsearch, PostgreSQL, Phoenix, Hive, Presto, Druid, Domeo, Hawk, Impala, Influx, MySQL, ArangoDB, Neo4j, etc."

There really isn't that much jargon, but if you want to keep arguing this thread for another hundred years, here's TBox, ABox, Term, Context, Model, Punning, HttpRange14, Ontological commitment, Open world reasoning, and unique name assumption.

None of this helps your middling 30%ers. If they don't understand the terminology they actually read something about it. Your middle 30% moved on a long time ago. They are Don Draper in an elevator saying, "I don't think about you at all."

zacharywhitley avatar Mar 23 '19 12:03 zacharywhitley

The funny thing is I am that 30% developer that you describe. I didn’t understand the terminology and so I read something about it, and that took time that I’d like to try and save other developers from having to waste. I mean c’mon “predicates”? 😂

Look at this well written intro to OWL 2 that could describe RDF:

OWL 2 is “a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things.”

  • https://www.w3.org/TR/owl2-primer/#Introduction

You could put that on a landing page it’s so simple. But replace the words in that sentence with current RDF terminology and it has very quickly lost its appeal.

I am “solving real problems” outside of this, Martynas 😂😂😂

anthonymoretti avatar Mar 23 '19 12:03 anthonymoretti

Getting started with RDF by learning that subjects are URIs, that predicates are some kind of URI, and that objects can be some data type is very complicated. Considering URIs to be data types is strange.

IMO Datomic (which completely avoids the RDF vocabulary for some reason...) speaks in terms of Entity Attribute Value (which might be somewhat misleading).

I think that Identifier Key Value could be a good middle ground: it reuses existing software engineering vocabulary while being backward compatible with the original triple Subject Predicate Object.

amirouche avatar Mar 23 '19 13:03 amirouche

Changing what you call something isn't going to help you understand it. What exactly is the problem with the word "predicate"? You don't like that you had to read something to understand it? What technology are you working with where you don't need specialized terminology or have to read to understand it? There's nothing about the word "broker" that helps me understand Kafka. Or how about "a monad is just a monoid in the category of endofunctors"? In response to that, people published articles titled "A Monoid is a Burrito", which were then countered with "A Monad is a Burrito and other Functional Myths". If other communities can flourish with this kind of navel gazing then I don't think "predicate" is really the problem. NLP uses the word predicate, Apache Camel has predicates. Guava has a predicate, as does vavr.

If you'd like better tutorials, videos, and instructional material that presents things more explicitly and helps you understand the concepts faster, then let's do that, but changing the terminology is not going to help anyone understand the concepts any better and will most likely make things more confusing.

zacharywhitley avatar Mar 23 '19 14:03 zacharywhitley

@amirouche You're just replacing the names (Subject -> Identifier, Predicate -> Key, and Object -> Value) with arguably inferior replacements. There is nothing key-like in the predicate, the value can be another identifier as well as a literal, and the Identifier identifies things as much as your key or possibly your value does.

If that mental mapping works for you then by all means use it. Datomic doesn't use RDF because it chose not to use that standard and is based on Datalog (it actually uses a 5-tuple).

You think that Subject/Predicate/Object is too complicated, like Identifier/Key/Value, and think Entity/Attribute/Value is misleading?

zacharywhitley avatar Mar 23 '19 15:03 zacharywhitley

@anthonymoretti says:

Relationship - I agree that “relation” is preferable.

Why? Isn't that just a matter of US vs GB usage of English?

draggett avatar Mar 23 '19 17:03 draggett

There's already a term called rdf:resource, and if that isn't a good enough reason, it's called (R)esource because it's the same R as in U(R)L, U(R)I, and I(R)I. It's a web resource.

First: as I said, I don't think agreeing to use a term other than "Resource" is a Good Idea. However, we need to accept that it is problematic, and not pretend (to newbies, when we talk to them) that it isn't.

Because: No, it isn't "a web resource". That's the whole point. If it was, then fine. But if it was, then we would still be using URL, but we aren't, we have changed to URI, IRI, or whatever. So a URI doesn't Locate a Resource, because that would be a stupid thing to do for abstract things that are not on the Web. So, for example, saying that the URI for the Cowardly Lion's courage is identifying a Resource really stretches the natural idea of a resource rather far.

But this is one where we just have to suck it up and take the hit.

HughGlaser avatar Mar 23 '19 22:03 HughGlaser

So what exactly is the problem?

zacharywhitley avatar Mar 23 '19 23:03 zacharywhitley

Why don't you solve some real problems?

Please, let's keep the conversation civil and constructive. We should be welcoming the ideas of newcomers who can look at this stuff with fresh eyes -- not flinging insults at them. IMO newcomer perspectives are the most valuable of all, because newcomers represent the target demographic of this effort. Experienced RDF users are not.

Difficulty of use, and confusing off-putting jargon certainly are very real problems in RDF. And the whole purpose of this discussion is to collect ideas for addressing them. Nobody expects all ideas to be adopted. But we need to get fresh ideas on the table -- the more the better -- in order to eventually figure out which ones we might want to pursue. We cannot do that by creating a climate of intimidation and elitism.

The jargon barrier is one that I had forgotten, since it has been so long since I faced it myself. I am glad that it was brought to our attention.

dbooth-boston avatar Mar 24 '19 02:03 dbooth-boston

Cheers David. That’s true, I’m definitely not expecting any or all ideas to be adopted, just want to put them out there.

Hugh explained it well, as far as I can tell URIs officially took on the broader meaning in 2004:

https://www.w3.org/TR/webarch/#id-resources

Even there, if you look at the third paragraph, the editors redefine “resource” to mean “thing”.

Guess I’m wondering why it would be so hard to do. Coming from iOS development I’m very familiar with deprecation, every year with each iOS release there are many deprecated APIs, it’s a fact of life for iOS developers, millions of us cope, and the frameworks improve over time. What’s different about the RDF ecosystem?

Dave, fair question. I’m no linguist, but I think it’s a subtlety between the words rather than US vs GB. If we take the Oxford dictionary, both “relation” and “relationship” start with the same definition:

“The way in which two or more concepts, objects, or people are connected.”

Then they differ slightly immediately after that:

Relation: “a thing's effect on or relevance to another.”

Relationship: “or the state of being connected.”

So, I could be wrong, but one seems more applicable to types of relationships, and the other to instances, and we’re after types.

anthonymoretti avatar Mar 24 '19 04:03 anthonymoretti