Ontology and data modeling description is flawed
Learning module 1 contains seriously flawed descriptions of data models and ontologies:
Key terms: "Data model - A digital representation of a physical system that communicates connections between data points and structures."
A data model is a representation of data used (processed, communicated and stored) by computing systems, including documents and messages related to physical and virtual (digital twin) systems.
Data model parts - every data model includes the following parts: Entities, Entity types, Properties, Relationships
Data models include Entities, Entity types, and Properties.
- An entity is a Type name, like Temperature or AirHandler.
- An entity type defines the Type (not class) of the named entity, e.g., Temperature is an Integer with unit deci-degrees C. An instance of a Type is a Value, e.g. "31.7".
- Properties describe the data fields of an entity: AirHandler has properties "set_point" of type Temperature, "inlet_temp" of type Temperature, and "run_state" of type Boolean.
Note: Class and Type, and their instances object and value, have conflicting usage across the industry. But classes define behavior while types do not. Programs have classes, while documents and messages do not. Instances of a type that have the same value are equivalent, while objects (instances of a class) are never equal even if they have the same state. JSON contributes to the confusion: it calls {"a": 3} an "object" rather than the "value" of an associative array/map/dict type.
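To make the value/object distinction concrete, here is a minimal Python sketch. The names `Temperature` and `AirHandlerController` are hypothetical and chosen only for illustration, and the "objects are never equal" behavior shown relies on Python's default identity comparison for classes that do not define `__eq__`; other languages may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Temperature:
    """A type in the data-model sense: state only, compared by value."""
    deci_degrees_c: int


class AirHandlerController:
    """A class in the programming sense: state plus behavior."""

    def __init__(self, set_point):
        self.set_point = set_point

    def raise_set_point(self, delta):
        self.set_point += delta


# Two values of the same type with the same content are equivalent.
values_equal = Temperature(317) == Temperature(317)

# Two objects with identical state are still distinct objects:
# without a custom __eq__, Python compares them by identity.
objects_equal = AirHandlerController(317) == AirHandlerController(317)

# A JSON "object" loads as a dict, which behaves like a value.
dicts_equal = {"a": 3} == {"a": 3}

print(values_equal, objects_equal, dicts_equal)
```
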
Example instance where Entity A (AirHandler) contains Entities B, C, D (set_point, inlet_temp, run_state):
{
"air_handler_13": {
"set_point": 32,
"inlet_temp": 28.4,
"run_state": true
}
}
A data model does not include Relationships that connect two entities, such as an AirHandler that "feeds" a BuildingManager by sending and receiving messages defined by the Data model.
Different types of data models depict varying levels of detail about a system and abstraction of its data.
True - concrete models define a specific data format, abstract models (information models) define document/message content independently of data format, but conceptual models may contain Type names without defining the Properties of that type (a model defines the fact that a type called AirHandler exists but contains no additional detail).
Why bother with an ontology? - Naming conventions and taxonomies are established by an ontology.
False: A taxonomy can be established on its own, without an ontology. Likewise, data models can be established as concrete data models or abstract information models without an ontology. An ontology can copy those models, but it is more efficient to use the ontology to define relationships between resources rather than the full level of detail within a resource. An ontology would typically not be used to define the content of an IP packet or an image file, or a Haystack Xeto type, and using Turtle serialization for instances of those types would be very inefficient.
In graph terms:
- A taxonomy is a tree with only "contain" edges (see https://www.biologyonline.com/dictionary/taxonomy) and only one container of any type.
- A data/information/conceptual model is a directed acyclic graph (DAG) with "contain" and "reference" edges, and unlike a taxonomy it allows Type reuse in more than one property.
- An ontology is a mesh with unconstrained edge ends and arbitrary predicates beyond "contains" and "references", e.g., "feeds".
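As a rough sketch of the three graph shapes above, the following Python classifies a list of (source, predicate, target) edges. The function names and the sample edges are hypothetical, and the rules are a simplification of the definitions in this comment, not a formal test.

```python
from collections import Counter, defaultdict


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph."""
    graph = defaultdict(list)
    for src, _pred, dst in edges:
        graph[src].append(dst)
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph.get(node, []):
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in list(graph))


def classify(edges):
    """Simplified classification following the three bullets above."""
    predicates = {pred for _src, pred, _dst in edges}
    containers = Counter(dst for _src, _pred, dst in edges)
    acyclic = not has_cycle(edges)
    single_container = all(count == 1 for count in containers.values())
    if predicates <= {"contains"} and acyclic and single_container:
        return "taxonomy"
    if predicates <= {"contains", "references"} and acyclic:
        return "data model"
    return "ontology"


taxonomy = [
    ("Animal", "contains", "Mammal"),
    ("Animal", "contains", "Bird"),
    ("Mammal", "contains", "Dog"),
]
model = [
    ("AirHandler", "contains", "set_point"),
    ("AirHandler", "contains", "inlet_temp"),
    ("set_point", "references", "Temperature"),   # Temperature is reused
    ("inlet_temp", "references", "Temperature"),  # by two properties
]
ontology = model + [("AirHandler", "feeds", "BuildingManager")]

print(classify(taxonomy), classify(model), classify(ontology))
```
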
Ack on the definition of the data model. I see your point and agree with your differentiation. We can add that to our next round of documentation updates.
I don't understand your point about the data model not containing relationships. Our data model does include relationships. Even if your point is that relationships apply to entity instances rather than entity types, it's still true that the concept of relationships is defined by the data model.
I am also confused by your last point: are you merely pointing out that, technically, a taxonomy is not an ontology by default? That you can have a naming convention without it rising to the standard of an ontology? If so, fair enough. We can change that sentence to something like "ontologies can take a lot of time to develop from scratch".
"Contain" relationships are structural - they are built into schema languages like XSD (elements contain other elements, including dedicated elements like <sequence>) and JSON Schema (objects and arrays contain properties and items). But schemas don't have built-in structures like "friend of" or "feeds", because there are an infinite number of predicates - they are defined by the user, not the schema language. I meant to say "data models don't have user-defined relationships" - those other than contain and reference, and "is a" which is simply the relationship between instance and type - instance John "is a" type Person.
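A small sketch of what "structural" means here, using a JSON Schema fragment written as a plain Python dict. The schema content and the helper `structural_edges` are assumptions made up for illustration; only the keywords `properties`, `$ref`, `$defs`, and `type` come from JSON Schema itself. Walking the schema yields only "contains" and "references" edges, because those are the only relationships the schema language has keywords for.

```python
# Hypothetical schema for the AirHandler example in this thread.
schema = {
    "$defs": {"Temperature": {"type": "number"}},
    "type": "object",
    "properties": {
        "set_point": {"$ref": "#/$defs/Temperature"},
        "inlet_temp": {"$ref": "#/$defs/Temperature"},
        "run_state": {"type": "boolean"},
    },
}


def structural_edges(name, node):
    """Collect the edges that the schema keywords themselves express."""
    edges = []
    for prop, sub in node.get("properties", {}).items():
        edges.append((name, "contains", prop))
        if "$ref" in sub:
            target = sub["$ref"].rsplit("/", 1)[-1]
            edges.append((prop, "references", target))
        edges.extend(structural_edges(prop, sub))
    return edges


edges = structural_edges("AirHandler", schema)
predicates = {pred for _src, pred, _dst in edges}
print(predicates)
```

A predicate such as "feeds" never shows up in `predicates`, since no schema keyword expresses it; it would have to be introduced by the user as ordinary data.
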
For the last point, an ontology could limit itself to just "contains" and "is a" predicates, in which case it would be a taxonomy. But that's an inefficient way of representing "objects" that automatically have a "type" and "properties", and it requires ontology software to interpret the data, as opposed to simple JSON data that an application can just read into, e.g., a Python dict variable.
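To illustrate the efficiency point, here is a minimal sketch that reads the AirHandler JSON example from earlier in this thread into a Python dict with the standard `json` module, then flattens the same data into subject/predicate/value triples. The triple layout is an assumption for illustration only and is not Turtle or any particular ontology format.

```python
import json

doc = """
{
  "air_handler_13": {
    "set_point": 32,
    "inlet_temp": 28.4,
    "run_state": true
  }
}
"""

# Plain JSON: one call, and the application has typed properties.
data = json.loads(doc)
handler = data["air_handler_13"]
set_point = handler["set_point"]

# The same content as flat triples: every property becomes its own
# statement, and the grouping into one "object" has to be rebuilt
# by whatever software consumes the triples.
triples = [
    (subject, prop, value)
    for subject, props in data.items()
    for prop, value in props.items()
]

print(set_point, len(triples))
```
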