Dates from schema.org are not compacted correctly.

Open rknLA opened this issue 7 years ago • 3 comments

I've traced this down to https://github.com/digitalbazaar/pyld/blob/master/lib/pyld/jsonld.py#L4132

The original symptoms that I observed were that various Date fields that should be compacted to keys like birthDate were actually including a schema: prefix.

It looks like the core problem is that the default inverse dictionary for @language is not getting populated correctly (perhaps due to a change in the published @context from schema.org?)

Date field specifications include both @id and @type, but not @language, and are still expected to be pure strings (according to schema.org's documentation). When the compactor attempts to find the correct term, _select_term returns None.

The inverse dictionary winds up looking something like this:

{
    'http://schema.org/birthDate': {'@none': {'@language': {},
                                              '@type': {'http://schema.org/Date': 'birthDate'}}},
    # etc
}

Instead of containing a {'@none': 'birthDate'} for @language, there's just an empty dict.
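To make the symptom concrete, here is a simplified sketch of the lookup. It is not pyld's actual _select_term; the function name select_term_sketch and the reduced lookup logic are assumptions for illustration only. It shows that, with the inverse dictionary above, a plain string finds no term while a typed value does.

```python
def select_term_sketch(inverse, iri, value):
    """Pick a term for `iri` based on the shape of `value` (simplified, hypothetical)."""
    entry = inverse.get(iri, {}).get('@none', {})
    if isinstance(value, dict) and '@type' in value:
        # Typed values are looked up by their datatype IRI.
        return entry.get('@type', {}).get(value['@type'])
    # Plain strings are looked up in the language map under '@none'.
    return entry.get('@language', {}).get('@none')


# Inverse dictionary shaped like the one shown in this report.
inverse = {
    'http://schema.org/birthDate': {
        '@none': {
            '@language': {},
            '@type': {'http://schema.org/Date': 'birthDate'},
        }
    }
}

iri = 'http://schema.org/birthDate'
plain = select_term_sketch(inverse, iri, '2012')
typed = select_term_sketch(
    inverse, iri, {'@value': '2012', '@type': 'http://schema.org/Date'})
print(plain)  # None, so the compactor falls back to schema:birthDate
print(typed)  # birthDate
```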

Some other non-date fields also seem to exhibit this issue, but I don't know enough about the library or json-ld to know if these symptoms are actually problems, or if they're by design.

Minimum-reproducible sample:

#!/usr/bin/env python

from pyld import jsonld

doc = {
    'http://schema.org/name': 'Buster the Cat',
    'http://schema.org/birthDate': '2012',
    'http://schema.org/deathDate': '2015-02-25'
}

frame = {
    '@context': 'http://schema.org/'
}

framed = jsonld.frame(doc, frame)
contents = framed['@graph'][0]
print(framed)
assert 'name' in contents  # fine
assert 'birthDate' in contents  # not fine, schema:birthDate instead
assert 'deathDate' in contents  # not fine, schema:deathDate instead

My proposal to fix this would be to apply https://github.com/Artory/pyld/commit/faaa1394dc32ecfdd6fd875f45e4fdbc931fa7cc, to attempt to set these defaults regardless of the outcome of the conditionals there.

rknLA, Jul 18 '17 17:07

I've run into this problem too. Unfortunately, the patch breaks the tests. Most of the failures are ordering differences and SSL errors, but some of them aren't. I'm not familiar enough with JSON-LD to know what is wrong with those, though.

alantrick, Sep 20 '18 22:09

@alantrick The tests have many failures now because the test suite and specs moved forward and the code hasn't caught up. Unless someone jumps in on this, the code likely won't be updated until after the js library has also caught up.

As to the original problem, it can also be tested with just compaction. Here's the example with JSON quoting for easier playground use.

Input:

{
  "http://schema.org/name": "Buster the Cat",
  "http://schema.org/birthDate": "2012",
  "http://schema.org/deathDate": "2015-02-25"
}

Context:

{
  "@context": "http://schema.org/"
}

Note these lines from the schema.org context:

        "@vocab": "http://schema.org/",
        "schema": "http://schema.org/",
        "Date": {"@id": "schema:Date"},
        "birthDate": { "@id": "schema:birthDate", "@type": "Date"},
        "deathDate": { "@id": "schema:deathDate", "@type": "Date"},
        "name": { "@id": "schema:name"},

The test input just has simple strings for the date values. If you change the input so the data has the expanded Date type, then the compaction will work (you can test all of this with just compaction, no framing needed):

{
  "http://schema.org/name": "Buster the Cat",
  "http://schema.org/birthDate": {"@value": "2012", "@type": "http://schema.org/Date"},
  "http://schema.org/deathDate": {"@value": "2015-02-25", "@type": "http://schema.org/Date"}
}

Output:

{
  "@context": "http://schema.org/",
  "birthDate": "2012",
  "deathDate": "2015-02-25",
  "name": "Buster the Cat"
}
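The input change described above can be sketched as a small pre-processing helper. This is a workaround sketch, not part of pyld: the name add_date_types and the DATE_PROPERTIES set are hypothetical, and it only handles top-level plain-string values.

```python
SCHEMA_DATE = 'http://schema.org/Date'

# Hypothetical list of properties whose schema.org term is typed as Date.
DATE_PROPERTIES = {
    'http://schema.org/birthDate',
    'http://schema.org/deathDate',
}


def add_date_types(doc, date_properties=DATE_PROPERTIES):
    """Return a copy of `doc` with plain-string dates wrapped as typed values."""
    typed = {}
    for key, value in doc.items():
        if key in date_properties and isinstance(value, str):
            # Wrap the string so it matches the term's declared @type.
            typed[key] = {'@value': value, '@type': SCHEMA_DATE}
        else:
            typed[key] = value
    return typed


doc = {
    'http://schema.org/name': 'Buster the Cat',
    'http://schema.org/birthDate': '2012',
    'http://schema.org/deathDate': '2015-02-25',
}
typed_doc = add_date_types(doc)
print(typed_doc['http://schema.org/birthDate'])
```

The result can then be passed to jsonld.compact(typed_doc, {'@context': 'http://schema.org/'}), which fetches the remote context and so needs network access.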

Schema.org does say to use a string ISO 8601 date, but in JSON-LD it still needs to be typed. I don't recall the reason it's matching on types to do the term compaction. I think that's how it's supposed to work but I'm unsure. @gkellogg Perhaps you can confirm that if you have a moment?

Here's a self-contained simple compaction test.

Input:

{
  "http://example.org/a": "A",
  "http://example.org/b": "B",
  "http://example.org/c": {"@value": "C", "@type": "urn:C"}
}

Context:

{
  "@context": {
    "ex": "http://example.org/",
    "a": {"@id": "http://example.org/a"},
    "b": {"@id": "http://example.org/b", "@type": "urn:B"},
    "c": {"@id": "http://example.org/c", "@type": "urn:C"}
  }
}

Output:

{
  "@context": {
    "ex": "http://example.org/",
    "a": {
      "@id": "http://example.org/a"
    },
    "b": {
      "@id": "http://example.org/b",
      "@type": "urn:B"
    },
    "c": {
      "@id": "http://example.org/c",
      "@type": "urn:C"
    }
  },
  "a": "A",
  "ex:b": "B",
  "c": "C"
}

davidlehn, Sep 21 '18 01:09

"I don't recall the reason it's matching on types to do the term compaction. I think that's how it's supposed to work but I'm unsure."

When compacting, it's necessary to be sure that the value matches the term definition, which includes both @type and @language. Otherwise, if it compacted to that term and used a string value, it would not expand back to its proper value.

ISO 8601 does allow a greater range of date styles than does xsd:date, which is fine, and nothing actually checks the value for conformance, but if the term says it's a schema:Date, then "2018" will expand to {"@value": "2018", "@type": "http://schema.org/Date"}, so it must be of that form to be re-compacted using the same term.
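The round-trip requirement can be sketched in a few lines. This is a toy model, not the JSON-LD algorithms or pyld's code; expand_value and can_compact_to are hypothetical names, and only the @type part of a term definition is considered.

```python
def expand_value(term_def, value):
    """Expand a plain string, applying the term's @type if it declares one."""
    if isinstance(value, str):
        if '@type' in term_def:
            return {'@value': value, '@type': term_def['@type']}
        return {'@value': value}
    return value


def can_compact_to(term_def, expanded):
    """A term is usable only if the compacted string expands back to `expanded`."""
    compacted = expanded['@value']
    return expand_value(term_def, compacted) == expanded


birth_date = {'@id': 'http://schema.org/birthDate',
              '@type': 'http://schema.org/Date'}

plain = {'@value': '2018'}
typed = {'@value': '2018', '@type': 'http://schema.org/Date'}

print(can_compact_to(birth_date, plain))  # False: would expand back with a type
print(can_compact_to(birth_date, typed))  # True: round-trips unchanged
```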

gkellogg, Sep 22 '18 19:09