ceptr
shared vocabularies - how do you create shared understanding?
Semantic Web / Linked Data use various shared vocabularies to create shared understanding. You can find simple, centrally controlled vocabularies like http://schema.org as well as tons of independently created ones at http://lov.okfn.org
How do you connect semantics in ceptr to human concepts and language?
Great question!
In ceptr, we have the concept of a "Compository", which is a receptor that acts as a shared storage hub, primarily for other Receptor specifications to compose with (i.e. to build other receptors from), but also for symbol/structure and process sets, i.e. your vocabularies. The ceptr compository is one of the prime examples of the distributed shared receptors in ceptr. The compository might at first feel centralized because there will be just one CeptrNet address for it, but it really isn't, because pretty much every virtual machine host will be running an instance of it, holding, in a holographic sense, the parts of the compository that it needs or simply volunteers to share storage of. The protocols for adding items to the compository are tightly bound to the CeptrNetwork protocols, so this functionality is available to everyone at the lowest level, which makes it more like lov.okfn.org than schema.org. But we also expect some high-reputation categorization/tagging schemes (which will thus feel canonical) to emerge (and we will provide some initial ones) to make sense of all the vocabularies that we expect to see added to the compository.
Incidentally, we have already done some initial analysis of how to import XML/RDF/OWL-type schemas and ontologies into the ceptr framework. For the most part, if they are designed well, they should be importable directly. Unfortunately, one of the weak points of XML-based semantic systems is that they don't explicitly distinguish between structure and semantics, i.e. they don't make clear that semantics is the naming of a structural assemblage, whereas structures are the assemblage of semantic units. The prime example of this in XML is the perennial question of which form to use:
<item attr="x"></item> or <item><attr>x</attr></item>
So some ontologies that aren't consistent may be more difficult to import into ceptr.
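To make the ambiguity concrete, here is a minimal Python sketch (standard library only) that reads the same value from either of the two forms above. The helper name `read_attr` is made up for this illustration; it only shows that an importer has to decide, per schema, which form carries the value.

```python
import xml.etree.ElementTree as ET

# The two encodings of the same fact quoted above.
as_attribute = ET.fromstring('<item attr="x"></item>')
as_child = ET.fromstring('<item><attr>x</attr></item>')

def read_attr(item):
    """Return the value of 'attr' whichever of the two forms was used."""
    if "attr" in item.attrib:
        return item.attrib["attr"]
    child = item.find("attr")
    return child.text if child is not None else None

print(read_attr(as_attribute))
print(read_attr(as_child))
```

Both calls return the same string, but nothing in the XML itself says which form is the intended one.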
Do you know that you can use RDF without using any XML? The RDF model can be serialized to various formats: RDF/XML and RDFa are based on XML, but there are also Turtle and the very recent JSON-LD.
- http://www.w3.org/TR/2014/NOTE-rdf11-new-20140225/#section-serializations
- http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/
If you take a look at my draft of a personal profile in JSON-LD, you can find information expressed using Schema.org, CCO and FOAF, and soon I may use a few more vocabularies. How do you express such basic information about a person in Ceptr?
https://github.com/elf-pavlik/webprofiled/blob/master/test/fixtures/perpetual-tripper/index.json
If I may jump into this discussion (because schema.org et al., as well as XML/JSON, are of interest to me, as I've just landed an e-book contract with SyncFusion to write about the Semantic Web and Natural Language Processing), I'd like to venture the following.
Taken from an example GeoNames RDF:
<wgs84_pos:lat>44.56387</wgs84_pos:lat> <wgs84_pos:long>6.49526</wgs84_pos:long>
But why was the XML implemented in this manner, rather than using attributes, for example:
Furthermore, we can glean (as humans) from the xmlns that it actually is conveying something semantic: the World Geodetic System version is "84" (as opposed to an earlier version, say "72").
Now that's interesting. They're encoding useful meaning as the namespace name. Granted, in XML this is done all the time because the backing schema can apply various validation facets, define further sub-structure, etc., that may vary from version to version.
There's another awkwardness here. Any XML element can have child elements, and unless you happen to also have the XSD, you really don't know if that's a possibility. On the other hand, XML attributes cannot have children -- this is why we use them, among other things, to state that this is a non-extensible attribute of the containing element (well, if I remember my NIEM training, there's some crazy things you can do to work around that.) But you get the idea.
Back to the topic -- I'm curious how this encoding is expressed in JSON -- does the "name" retain the fact that we're using the wgs84_pos schema? This is potentially vital information for the system to know, semantically, what type of lat/long we have here.
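One hedged way to look at this question: in the sketch below, Python's standard XML parser expands each prefixed element name to its full namespace URI, and a JSON-LD style `@context` can carry the same prefix mapping. The namespace URI is the W3C Basic Geo (WGS84) vocabulary; the wrapping `<point>` element and the variable names are assumptions made up for the example, not taken from GeoNames.

```python
import json
import xml.etree.ElementTree as ET

WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# <point> is a hypothetical wrapper so the fragment parses on its own.
xml_doc = f"""
<point xmlns:wgs84_pos="{WGS84}">
  <wgs84_pos:lat>44.56387</wgs84_pos:lat>
  <wgs84_pos:long>6.49526</wgs84_pos:long>
</point>
"""

root = ET.fromstring(xml_doc)
# ElementTree reports tags as "{namespace-uri}localname".
values = {child.tag: float(child.text) for child in root}

# The same data as a JSON-LD style object: the prefix is declared in
# @context, so each key still resolves to the full WGS84 property IRI.
json_ld = {
    "@context": {"wgs84_pos": WGS84},
    "wgs84_pos:lat": values["{" + WGS84 + "}lat"],
    "wgs84_pos:long": values["{" + WGS84 + "}long"],
}
print(json.dumps(json_ld, indent=2))
```

So the "name" does keep its link to the wgs84_pos vocabulary, as long as the context travels with the data.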
Personally (and I could be completely wrong here, but here goes) if I wanted to capture this information in its full semantic expressiveness, I might have something like this:
- GeoCode
- WGS84_GeoCode
- LatLong
- Lat
- Long
- Other semantic types
- LatLong
- WGS84_GeoCode
- Some other compliant schema (a semantic type) we want to support
Now there's some interesting things we can do here.
- If I need to work with LatLong's in the specific WGS84 format, I can check the semantic tree for the existence of "WGS84_GeoCode"
- If I don't care what the implementing schema is (hopefully the *numbers* don't change) I can query the semantic tree simply for "LatLong" -- I don't care who implements it, I just want the values.
So, if I'm anywhere near the mark (no pun intended) I think you can see the capability of working with a semantic tree rather than what I consider to be an awkward mix of namespaces, elements and possibly attributes to capture the semantics.
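The two queries described above can be sketched in a few lines of Python. This is only an illustration, not ceptr's actual data model: the `Node` class is hypothetical, and the nesting (GeoCode > WGS84_GeoCode > LatLong > Lat/Long) is one assumed reading of the tree listed earlier.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional

@dataclass
class Node:
    """One node of a semantic tree: a semantic type name, an optional value, children."""
    name: str
    value: Optional[float] = None
    children: List["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def find(self, name: str) -> Optional["Node"]:
        """Return the first node with the given semantic name, or None."""
        return next((n for n in self.walk() if n.name == name), None)

# Assumed nesting; the values come from the GeoNames example above.
tree = Node("GeoCode", children=[
    Node("WGS84_GeoCode", children=[
        Node("LatLong", children=[
            Node("Lat", 44.56387),
            Node("Long", 6.49526),
        ]),
    ]),
])

# Query 1: do we have coordinates in the specific WGS84 format?
is_wgs84 = tree.find("WGS84_GeoCode") is not None

# Query 2: just give me the LatLong values, whoever implements them.
lat_long = tree.find("LatLong")
coords = {c.name: c.value for c in lat_long.children}
print(is_wgs84, coords)
```

The second query never mentions WGS84, which is the point: the caller asks for a semantic type and gets the values regardless of which schema supplied them.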
On the other hand, because JSON is much simpler than XML, being name-value pairs and hierarchical with collections, it seems more "naturally" suitable in expressing semantic trees (XML is certainly suitable, one just has to be more conscious, haha). So what would the GeoNames example look like in JSON?
Marc
Marc www.marcclifton.com
Why do we talk about XML? As I explained, one can happily use RDF without ever using XML; I myself mostly use JSON-LD and Turtle, plus RDFa if I want to embed an RDF graph in HTML. Can we try to stick to the original topic here? I would propose creating another issue/thread to discuss the peculiarities of XML :smile:
Pavlik,
I like your example of a personal profile, and I'll use it as an example when I write the Semantic Coherence section of the Ceptr Apocalypse doc, which will hopefully answer this question. I'll try to remember to post a link here when I get that written.
-art
thx @artbrock !
@zippy @cliftonm please check out this beta workbench http://generator.geoknow.eu/videos.html
IMO it shows some interesting capacity enabled by Linked Data :globe_with_meridians:
Thanks Pavlik. Will do. I've also been looking at the JSON-LD serialization of RDF.
I think part of what's odd about all this is that the RDF triple model seems to be what you do when you have a ton of data (the subject of the triple) that's not semantic and you want to weave a net of meaning on top of it. But in ceptr all data carries semantics. Like Art said, we'll get examples up to show how it works in our world.
@zippy in RDF subject --{predicate}--> object
- subject: URI (or blank node)
- predicate: URI
- object: URI or Literal (or blank node)
so objects allow non semantic values, which makes sense to capture labels, numeric values and similar
eg in Turtle
@prefix schema: <http://schema.org/> .
@prefix gh: <https://github.com/> .
@prefix ex: <http://example.net/#> .
gh:elf-pavlik a schema:Person ;
schema:name "elf Pavlik" ;
ex:shoeSize 41 ;
schema:knows gh:zippy ,
gh:artbrock .
same in JSON-LD, just with http://schema.org/ defined as default @vocab
{
"@context": {
"@vocab": "http://schema.org/",
"gh": "https://github.com/",
"ex": "http://example.net/#",
"knows": { "@type": "@id" }
},
"@id": "gh:elf-pavlik",
"@type": "Person",
"name": "elf Pavlik",
"ex:shoeSize": 41,
"knows": [
"gh:zippy",
"gh:artbrock"
]
}
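To show how the compact JSON-LD form relates to subject/predicate/object triples, here is a rough Python sketch that expands the prefixes by hand. It is a deliberate simplification, not the JSON-LD expansion algorithm: it handles only the features used in this example, skips `@type`, and every function and class name in it is made up for the illustration. It assumes `knows` is declared as IRI-valued in the context.

```python
class IRI(str):
    """Marks a string as an IRI so it can be told apart from a literal."""

doc = {
    "@context": {
        "@vocab": "http://schema.org/",
        "gh": "https://github.com/",
        "ex": "http://example.net/#",
        "knows": {"@type": "@id"},  # values of 'knows' are IRIs, not strings
    },
    "@id": "gh:elf-pavlik",
    "name": "elf Pavlik",
    "ex:shoeSize": 41,
    "knows": ["gh:zippy", "gh:artbrock"],
}

def expand_compact(term, context):
    """Return the full IRI for 'prefix:suffix' if the prefix is declared, else None."""
    prefix, sep, suffix = term.partition(":")
    if sep and isinstance(context.get(prefix), str):
        return context[prefix] + suffix
    return None

def expand_property(key, context):
    """Expand a property name, falling back to the default @vocab."""
    return expand_compact(key, context) or context["@vocab"] + key

def to_triples(doc):
    ctx = doc["@context"]
    subject = IRI(expand_compact(doc["@id"], ctx) or doc["@id"])
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        predicate = IRI(expand_property(key, ctx))
        term_def = ctx.get(key)
        iri_valued = isinstance(term_def, dict) and term_def.get("@type") == "@id"
        for item in value if isinstance(value, list) else [value]:
            if iri_valued:
                item = IRI(expand_compact(item, ctx) or item)
            triples.append((subject, predicate, item))
    return triples

triples = to_triples(doc)
for s, p, o in triples:
    print(s, p, repr(o))
```

The name and shoe size come out as plain literals, while the two `knows` objects come out as IRIs, matching the subject/predicate/object shapes listed above. For real data, a proper JSON-LD processor such as the playground linked in this thread is the safer choice.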
it gets really interesting when you want to capture multiple languages
@prefix db: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
db:Apple schema:name "Apple"@en ,
"Pomme"@fr ,
"تفاح_مستأنس"@ar ,
"Yapolo"@fj ,
"Яблуко"@uk .
{
"@context": {
"@vocab": "http://schema.org/",
"name": { "@container": "@language" }
},
"@id": "http://dbpedia.org/resource/Apple",
"name": {
"en": "Apple",
"fr": "Pomme",
"ar": "تفاح_مستأنس",
"uk": "Яблуко"
}
}
This uses http://www.w3.org/TR/json-ld/#string-internationalization. I suggest trying it out: http://json-ld.org/playground/
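Once the names are grouped by language like this, picking one for display is a simple lookup. A small Python sketch, assuming the language map from the JSON-LD example above; the function name and the fallback-to-English policy are assumptions of this illustration, not part of JSON-LD.

```python
# Language map taken from the JSON-LD example above.
names = {
    "en": "Apple",
    "fr": "Pomme",
    "ar": "تفاح_مستأنس",
    "uk": "Яблуко",
}

def pick_label(labels, preferred, default="en"):
    """Return the label for the first preferred language that is available."""
    for lang in preferred:
        if lang in labels:
            return labels[lang]
    return labels.get(default)

print(pick_label(names, ["fj", "fr"]))
print(pick_label(names, ["de"]))
```

The first call falls through to French because the JSON-LD example has no "fj" entry; the second falls back to the default language.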
Q: How do you handle multiple languages in ceptr?
We're clearer about how to represent our structural semantics (in semantic trees in the compository), but still need to finish nailing down our relational assertion model (in an instantiated space).
Example: Arthur Brock (in some human identifying semantic tree vocab shared in compository) is the "author of" (stored in some instantiated receptor for sharing documents) this document (defined in compository vocab).
Together with @fosterlynn, @bhaugen and @ahdinosaur, we are currently working in quite a systematic way on https://valueflo.ws/. Lately @Connoropolous and other people have also started engaging more with this work!
If you feel like having a call together in the next days/weeks to sync up, I would be very happy to help make it happen :hand: