Restore demos, document their config API and how to set up static-hosted and docker installations

Open danbri opened this issue 4 years ago • 2 comments

We should have documentation showing how to set this up for (a) static serving (b) docker serving, and a live installation of at least one of these.

Background

The original demos used server side processing for 3 things:

  • headless browser, so that JS-injected markup can be extracted
  • URL fetching, since a web page can't retrieve arbitrary URLs
  • some simple Python/Flask code to map URL paths to file paths, generate HTML from templates, and JSON-ify a representation of the examples.

The whole thing can be run as a Docker container, but it would be good to have a simplified, purely static version that anyone could run very easily. To do this:

  • [ ] convert the file-mapping and JSONifying steps into pre-publication file management, i.e. serve the SHACL, ShEx, Turtle and JSON files the same way we serve up JS, CSS and HTML statically.
  • [ ] Make a simple HTML page from templates/scc.html etc.
  • [ ] Document the app's configuration API, i.e. the files it needs to load, which are currently served by the app.py and basic_app.py Flask example servers.

I made a first attempt at the documentation below.

Draft documentation

Config API

SchemaramaJS configures itself with various files loaded from relative URIs:

  • /shacl/shapes - an aggregation of SHACL shape definitions.
  • /shacl/subclasses - a Turtle file listing rdfs:subClassOf relationships.
  • /shex/shapes - an aggregation of ShEx shape definitions.
  • /hierarchy - a JSON description that groups shape definitions in a hierarchy of associated services/projects.
  • /services/map - JSON associating services from hierarchy with shapes and patterns from SHACL and ShEx.
  • /tests - a JSON list of tests, where each test is a piece of text using JSON-LD, RDFa or Microdata.

A deployment will also typically serve icons associated with the hierarchy of services, e.g. the initial demo uses:

  • /static/images/services/Schema.png
  • /static/images/services/ServiceA.png
  • /static/images/services/ServiceB.png
  • /static/images/services/ServiceBProduct1.png
  • /static/images/services/ServiceBProduct2.png
  • /static/images/services/ServiceBProduct3.png
  • /static/images/services/ServiceC.png
  • /static/images/services/ServiceD.png
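
For static hosting it can help to see where each config path actually resolves. The following is a minimal sketch, not SchemaramaJS code: `CONFIG_ENDPOINTS` and `resolveConfigUrls` are hypothetical names, the format labels are taken from the descriptions in this draft, and the base URLs are made up. It relies on the fact that core.js requests these paths without a leading slash, so they resolve relative to the page that loads it.

```javascript
// Hypothetical summary of the six config paths and the format each one carries.
const CONFIG_ENDPOINTS = [
  { path: "shacl/shapes", format: "turtle" },     // SHACL shapes in RDF/Turtle
  { path: "shacl/subclasses", format: "turtle" }, // rdfs:subClassOf triples
  { path: "shex/shapes", format: "json" },        // ShEx shapes in ShExJ
  { path: "hierarchy", format: "json" },
  { path: "services/map", format: "json" },
  { path: "tests", format: "json" },
];

// Resolve each relative path against the page URL, as a browser would.
function resolveConfigUrls(baseUrl) {
  return CONFIG_ENDPOINTS.map(({ path, format }) => ({
    url: new URL(path, baseUrl).href,
    format,
  }));
}

// Example base URL (an assumption): a static copy served from a /demo/ folder.
for (const { url, format } of resolveConfigUrls("http://127.0.0.1:3002/demo/")) {
  console.log(format.padEnd(6), url);
}
```

Note the trailing slash on the base URL: without it, the last path segment is treated as a file name and the config paths resolve one level higher.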

Config details

The original demo shows a mix of shapes - some basic structures from Schema.org's definitions, and some associated with example online services. SchemaramaJS will try to load these upon initialization.

/shacl/shapes

This can be quite large, e.g. looking at headers using

curl -s -D - -o /dev/null http://127.0.0.1:3002/shacl/shapes

Content-Disposition: inline; filename=full.shacl
Content-Type: application/octet-stream
Content-Length: 223194

We get a large dump of SHACL in RDF/Turtle syntax.

/shex/shapes

Similarly, here we are served (in demo configuration):

HTTP/1.0 200 OK
Content-Disposition: inline; filename=full.shexj
Content-Type: application/octet-stream
Content-Length: 633692
Last-Modified: Wed, 09 Mar

For the ShEx version, we get a large dump of ShEx shapes in ShExJ (JSON) syntax.

/shacl/subclasses

curl -s -D - http://127.0.0.1:3002/shacl/subclasses

This data file reproduces rdfs:subClassOf assertions from the relevant schemas. It is in Turtle format and is not tightly linked to SHACL, except that only the SHACL validator uses it; it is not passed to the ShEx validator during setup. In principle it could be used for other purposes, and we could change the file/URL path accordingly.

In the demo configuration, it contains every subtype-supertype relationship defined in schema.org, so a type can appear with several supertypes, including indirect ones such as schema:Thing. Here are the lines relating to the ComedyClub type:

curl -s -D - http://127.0.0.1:3002/shacl/subclasses | grep ComedyClub

schema:ComedyClub rdfs:subClassOf schema:Place .
schema:ComedyClub rdfs:subClassOf schema:EntertainmentBusiness .
schema:ComedyClub rdfs:subClassOf schema:Organization .
schema:ComedyClub rdfs:subClassOf schema:LocalBusiness .
schema:ComedyClub rdfs:subClassOf schema:Thing .
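
To illustrate the shape of this data, here is a rough sketch that groups such lines into a map from type to supertypes. It is not how SchemaramaJS reads the file, and it is not a Turtle parser: `parseSubclassLines` is a hypothetical helper that only understands the one-triple-per-line, prefixed-name form shown above.

```javascript
// Collect "sub rdfs:subClassOf super ." lines into a Map of type -> Set of supertypes.
// Assumption: one triple per line, written with prefixed names as in the listing above.
function parseSubclassLines(text) {
  const pattern = /^\s*(\S+)\s+rdfs:subClassOf\s+(\S+)\s*\.\s*$/;
  const supertypes = new Map();
  for (const line of text.split("\n")) {
    const match = pattern.exec(line);
    if (!match) continue; // skip prefix declarations, blank lines and anything else
    const [, sub, sup] = match;
    if (!supertypes.has(sub)) supertypes.set(sub, new Set());
    supertypes.get(sub).add(sup);
  }
  return supertypes;
}

const comedyClubLines = [
  "schema:ComedyClub rdfs:subClassOf schema:Place .",
  "schema:ComedyClub rdfs:subClassOf schema:EntertainmentBusiness .",
  "schema:ComedyClub rdfs:subClassOf schema:Organization .",
  "schema:ComedyClub rdfs:subClassOf schema:LocalBusiness .",
  "schema:ComedyClub rdfs:subClassOf schema:Thing .",
].join("\n");

const comedyClubSupertypes = parseSubclassLines(comedyClubLines).get("schema:ComedyClub");
console.log([...comedyClubSupertypes]);
```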

/hierarchy

SchemaramaJS loads a JSON configuration file defining a hierarchy of services/applications that can be associated with the various validations being checked. In turn this file can include image URLs.

Demo config is this:

{
  "nested": [
    {
      "service": "ServiceA"
    },
    {
      "nested": [
        {
          "service": "ServiceBProduct1"
        },
        {
          "service": "ServiceBProduct2"
        },
        {
          "service": "ServiceBProduct3"
        }
      ],
      "service": "ServiceB"
    },
    {
      "service": "ServiceC"
    },
    {
      "service": "ServiceD"
    }
  ],
  "service": "Schema"
}
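
The structure is a simple tree: each node has a "service" name and an optional "nested" array of child nodes. As an illustration only, the sketch below walks the demo config depth-first and records each service with its depth. `flattenHierarchy` is a hypothetical helper and is not the `constructHierarchySelector` function used by core.js.

```javascript
// Walk the hierarchy depth-first, listing each service together with its depth.
function flattenHierarchy(node, depth = 0, out = []) {
  out.push({ service: node.service, depth });
  for (const child of node.nested || []) {
    flattenHierarchy(child, depth + 1, out);
  }
  return out;
}

// The demo config from above, written compactly.
const demoHierarchy = {
  service: "Schema",
  nested: [
    { service: "ServiceA" },
    {
      service: "ServiceB",
      nested: [
        { service: "ServiceBProduct1" },
        { service: "ServiceBProduct2" },
        { service: "ServiceBProduct3" },
      ],
    },
    { service: "ServiceC" },
    { service: "ServiceD" },
  ],
};

for (const { service, depth } of flattenHierarchy(demoHierarchy)) {
  console.log("  ".repeat(depth) + service);
}
```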

/services/map

SchemaramaJS also uses a JSON service mapping file, which associates validation shapes (named in common across SHACL and ShEx) with the services described in /hierarchy:

{
  "ValidSchemaAboutPage": "Schema",
  "ValidSchemaAcceptAction": "Schema",
  "ValidSchemaAccommodation": "Schema",
  "ValidSchemaAccountingService": "Schema",
  "ValidSchemaAchieveAction": "Schema",
  "ValidSchemaAction": "Schema",
  "ValidSchemaActionAccessSpecification": "Schema",
  "ValidSchemaActionStatusType": "Schema",
  "ValidSchemaActivateAction": "Schema",
  "ValidSchemaAddAction": "Schema",
  "ValidSchemaAdministrativeArea": "Schema",
  "ValidSchemaAdultEntertainment": "Schema",
  "ValidSchemaAggregateOffer": "Schema",
  "ValidSchemaAgreeAction": "Schema",
  "ValidSchemaAirline": "Schema",
  "ValidSchemaAirport": "Schema", [...etc etc...]
  "ValidSchemaWriteAction": "Schema",
  "ValidSchemaXPathType": "Schema",
  "ValidSchemaZoo": "Schema",
  "ValidServiceBRecipe": "ServiceB",
  "ValidServiceBProduct1Recipe": "ServiceBProduct1",
  "ValidServiceBProduct2Recipe": "ServiceBProduct2",
  "ValidServiceBProduct3Recipe": "ServiceBProduct3",
  "ValidServiceARecipe": "ServiceA",
  "ValidServiceDRecipe": "ServiceD",
  "ValidServiceCRecipe": "ServiceC" 
}
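
The file maps shape name to service name, so finding all shapes for one service means inverting it. A minimal sketch of that inversion follows; `shapesByService` is a hypothetical helper, and the sample object is a small excerpt of the demo map above rather than the full file.

```javascript
// Invert a shape -> service mapping into a Map of service -> list of shape names.
function shapesByService(shapeToService) {
  const result = new Map();
  for (const [shape, service] of Object.entries(shapeToService)) {
    if (!result.has(service)) result.set(service, []);
    result.get(service).push(shape);
  }
  return result;
}

// A small excerpt of the demo mapping shown above.
const sampleServiceMap = {
  ValidSchemaAboutPage: "Schema",
  ValidSchemaZoo: "Schema",
  ValidServiceBRecipe: "ServiceB",
  ValidServiceBProduct1Recipe: "ServiceBProduct1",
  ValidServiceARecipe: "ServiceA",
};

for (const [service, shapes] of shapesByService(sampleServiceMap)) {
  console.log(service, "->", shapes.join(", "));
}
```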

/tests

Finally, SchemaramaJS loads a collection of example tests. Each test is an appropriately escaped text value, and the collection is structured as a very plain JSON file:

{ 
  "tests": [ 
     "escaped markup here e.g. json-ld...", 
     "second example here e.g. microdata..." 
  ]
}

No additional metadata is included; SchemaramaJS will try to figure out how to parse each test.
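
One way to get the escaping right when preparing this file for static hosting is to let a JSON serializer do it instead of escaping by hand. The sketch below assumes the plain { "tests": [...] } layout shown above; `buildTestsFile` is a hypothetical helper and the two markup samples are made up for illustration.

```javascript
// Serialize raw markup strings into the { "tests": [...] } layout.
// JSON.stringify takes care of escaping quotes, newlines and backslashes.
function buildTestsFile(markupSamples) {
  return JSON.stringify({ tests: markupSamples }, null, 2);
}

// Made-up samples: one JSON-LD snippet and one Microdata snippet.
const markupSamples = [
  '{"@context": "https://schema.org", "@type": "Recipe", "name": "Tea"}',
  '<div itemscope itemtype="https://schema.org/Recipe">\n  <span itemprop="name">Tea</span>\n</div>',
];

const testsFileText = buildTestsFile(markupSamples);
console.log(testsFileText);

// Reading the file back yields the original, unescaped markup.
const roundTripped = JSON.parse(testsFileText).tests;
console.log(roundTripped[0] === markupSamples[0]);
```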

Config-using Validator code

These files are all loaded by static/js/scc/core.js:

$(document).ready(async () => {
    $.getJSON("https://api.ipify.org/?format=json", function(e) {
        ip = e.ip;
    });
    await $.get(`shacl/shapes`, (res) => shaclShapes = res);
    await $.get(`shacl/subclasses`, (res) => subclasses = res);
    await $.get(`shex/shapes`, (res) => shexShapes = JSON.parse(res));
    await $.get(`hierarchy`, (res) => {
        hierarchy = res;
        constructHierarchySelector(hierarchy, 0);
    });
    await $.get(`services/map`, (res) => shapeToService = res);
    $.get(`tests`, (res) => initTests(res.tests));
    shexValidator = new schemarama.ShexValidator(shexShapes, {annotations: annotations});
    shaclValidator = new schemarama.ShaclValidator(shaclShapes, {
        annotations: annotations,
        subclasses: subclasses,
    });
});

danbri, Mar 18 '22 14:03

Started a rough script that copies things into the right place in an ephemeral "_serving" folder.

  • it takes everything from demo/static/
  • bit of care about symlinks between the js/ directories and core/dist/ originals
  • it deals with the 6 "Config API" URL paths, either copying the original files or, for the JSON, writing inline samples.
  • code is at https://gist.github.com/danbri/0c26bd3dedfcc66e73fb5f2b4c682e3e

danbri, Mar 18 '22 16:03

Possible diagnosis and fix for this not running: we're using very simple static HTTP servers that aren't sending the right media type headers for things that are in JSON (or any format for that matter), so jQuery hands the callback the raw text rather than a parsed object.

I tried

    $.get(`tests`, (res) => { 
        let jres = $.parseJSON(res);    
        initTests(jres.tests)
    });

... in core.js line 39 and it seems to work.
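
A variation that should work with both kinds of server is to parse only when the response arrives as text. This is a sketch under the assumption that jQuery passes a parsed object when the server labels the response as JSON and a string otherwise; `asJson` is a hypothetical helper, not part of core.js. An unconditional parse, as in the snippet above, would likely fail in the first case because the value is already an object.

```javascript
// Accept either raw JSON text or an already-parsed value.
function asJson(res) {
  return typeof res === "string" ? JSON.parse(res) : res;
}

// Possible use in core.js (shown as comments because jQuery is not loaded here):
//   $.get(`tests`, (res) => initTests(asJson(res).tests));
//   await $.get(`hierarchy`, (res) => { hierarchy = asJson(res); /* ... */ });

// Both forms give the same result:
const fromText = asJson('{"tests": ["a", "b", "c"]}');
const fromObject = asJson({ tests: ["a", "b", "c"] });
console.log(fromText.tests.length, fromObject.tests.length);
```

Another option may be to request JSON explicitly, for example with `$.getJSON` or by passing "json" as the dataType argument to `$.get`, so that the response is parsed regardless of the Content-Type header.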

Another gotcha: the demo currently assumes that at least 3 tests will be served from /tests.
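
A pre-publication check along these lines could catch that before a static copy goes live. It is only a sketch: `checkTestsConfig` and `MIN_TESTS` are hypothetical names, and the minimum of 3 comes from the observation above rather than from a documented constant.

```javascript
// Minimum number of tests the demo appears to expect (an observation, not a documented value).
const MIN_TESTS = 3;

// Return a list of human-readable problems; an empty list means the config looks usable.
function checkTestsConfig(config, minTests = MIN_TESTS) {
  const problems = [];
  if (!config || !Array.isArray(config.tests)) {
    problems.push('missing "tests" array');
    return problems;
  }
  if (config.tests.length < minTests) {
    problems.push(`expected at least ${minTests} tests, found ${config.tests.length}`);
  }
  config.tests.forEach((test, index) => {
    if (typeof test !== "string") problems.push(`tests[${index}] is not a string`);
  });
  return problems;
}

console.log(checkTestsConfig({ tests: ["first example", "second example"] }));
```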

danbri, Mar 18 '22 17:03