schemato
Schema.org definition not pulled correctly
The definition of schema.org is not being pulled correctly, meaning that any page with that markup will fail to parse.
The issue seems to be that the lexicon defined in the class MicrodataSchemaDef is fragile, as stated by the TODO: "certainly not the best way to do this, should probably use the pyRdfa/pyMicrodata graph APIs to make this more robust".
For now, the domain needs to be changed to "http://schema.org/domainIncludes" and the range to "http://schema.org/rangeIncludes".
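To make the predicate change concrete, here is a minimal sketch. It assumes the lexicon is a plain mapping from role names to predicate URIs; the names LEXICON and domains_of and the tuple-based sample triples are hypothetical and are not taken from MicrodataSchemaDef.

```python
# Hypothetical lexicon: role name -> predicate URI used by the schema.org definition.
LEXICON = {
    "domain": "http://schema.org/domainIncludes",
    "range": "http://schema.org/rangeIncludes",
}

# Hand-written sample triples (subject, predicate, object), for illustration only.
SAMPLE_TRIPLES = [
    ("http://schema.org/agent", "http://schema.org/domainIncludes", "http://schema.org/Action"),
    ("http://schema.org/agent", "http://schema.org/rangeIncludes", "http://schema.org/Person"),
]


def domains_of(prop, triples, lexicon=LEXICON):
    """Return the set of classes that the given property lists as its domain."""
    return {o for s, p, o in triples if s == prop and p == lexicon["domain"]}


def ranges_of(prop, triples, lexicon=LEXICON):
    """Return the set of classes that the given property lists as its range."""
    return {o for s, p, o in triples if s == prop and p == lexicon["range"]}
```

If the lexicon points at predicates that do not occur in the definition file, both lookups return empty sets, which would be consistent with the parse failures described above.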
However, with these changes there are still some types that are not included. One of these types is ListenAction.
Thanks for reporting this. I've changed the domain and range for schema.org validators in this branch.
The failure to recognize ListenAction is caused by the design of this code, which only registers a class from the schema.org schema if it is listed as the domain of a property. Since ListenAction (and other Action subclasses) have no non-inherited properties, there are no properties in the schema that list it as their domain. As a result, schemato doesn't realize that they exist.
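The gap can be reproduced with a few hand-written triples. This is only a sketch of the behaviour described above, not the schemato code itself: the triples are plain tuples, the function names are hypothetical, and collecting classes from their rdfs:Class declarations is one possible remedy, not necessarily the one applied in master.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
DOMAIN_INCLUDES = "http://schema.org/domainIncludes"

# Hand-written sample triples (subject, predicate, object), for illustration only.
TRIPLES = [
    ("http://schema.org/Action", RDF_TYPE, RDFS_CLASS),
    ("http://schema.org/ListenAction", RDF_TYPE, RDFS_CLASS),
    ("http://schema.org/ListenAction", RDFS_SUBCLASS_OF, "http://schema.org/ConsumeAction"),
    ("http://schema.org/agent", DOMAIN_INCLUDES, "http://schema.org/Action"),
]


def classes_from_domains(triples):
    """Register a class only if some property names it as a domain."""
    return {o for s, p, o in triples if p == DOMAIN_INCLUDES}


def classes_from_declarations(triples):
    """Register every subject that is explicitly declared as an rdfs:Class."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == RDFS_CLASS}


def all_classes(triples):
    """Union of both strategies, so property-less subclasses are not lost."""
    return classes_from_domains(triples) | classes_from_declarations(triples)
```

With these sample triples, the domain-only strategy finds Action but not ListenAction, while the declaration-based lookup finds both.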
The master branch now understands ListenAction and other Action subclasses after this change.
The remaining related issue is evaluation of the rdflib graph API to simplify the logic related to the lexicon property, and hopefully to make that entire construction obsolete.
Please test your ListenAction case against master and comment here on how it goes.
Apparently the schema.org definition file has changed in some subtle way that causes it not to parse. This function now returns an empty graph when asked to parse this file, rendering schema.org validation unusable.
Looking further into this, it seems that deleting the cache file schemaorg_schemadef.smt actually fixes the problem.
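For anyone who wants to script the workaround, a small helper along these lines should do. The directory that holds the cache is an assumption and has to be supplied by the caller, since its location is not stated here; the function name is hypothetical.

```python
from pathlib import Path

CACHE_NAME = "schemaorg_schemadef.smt"


def clear_schema_cache(cache_dir):
    """Delete the cached schema.org definition if it exists.

    cache_dir is whichever directory holds the cache file on your system
    (an assumption; adjust to your installation).
    Returns True if a file was removed, False if there was nothing to delete.
    """
    cache_file = Path(cache_dir) / CACHE_NAME
    if cache_file.is_file():
        cache_file.unlink()
        return True
    return False
```

After the file is removed, the definition should be fetched again on the next validation run, which matches the observation that deleting the cache fixes the problem.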