
It is not always relevant to silently ignore unrecognized properties

Open pchampin opened this issue 8 years ago • 4 comments

Consider the following code:

    from pyld import jsonld

    CTX = {"foo": "http://example.com/foo"}
    expanded = jsonld.expand({"fooo": "bar"}, {'expandContext': CTX})

The result is an empty list, because "fooo" is not recognized by CTX. Silently ignoring it can be appropriate in some situations, but not in others. In this example, it is probably a typo, which the user would probably like to have detected.

So I think there should be a way to be notified when some JSON attributes were ignored.

The simplest (but least flexible) way to do it would be to have a strict option, defaulting to False. When set to True, an exception would be raised whenever a JSON key could not be converted to an IRI. In my application, I would rather fail than silently drop some data submitted by the user.

A more flexible solution would be to have a droppedKeys option, expecting a set, and populating this set with every key that could not be converted to an IRI.

    CTX = {"foo": "http://example.com/foo"}
    dropped = set()
    # 'droppedKeys' is the option proposed in this issue; PyLD does not provide it
    expanded = jsonld.expand({"fooo": "bar"}, {'expandContext': CTX, 'droppedKeys': dropped})
    if dropped:
        print("Warning: ignored keys:", sorted(dropped))
    # but still process expanded
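
Until something like this exists in the library, both proposals can be approximated with a pre-check run before calling `jsonld.expand`. The following is only a rough sketch: `unmapped_keys`, `check_strict` and `UnmappedKeyError` are hypothetical names, not part of PyLD, and the check assumes a single flat context given as a dict and looks only at the top-level keys of the document. It does not reproduce the full JSON-LD term resolution rules.

```python
class UnmappedKeyError(ValueError):
    """Hypothetical error for keys that the context cannot map to an IRI."""


def unmapped_keys(doc, ctx):
    """Return the top-level keys of `doc` that the flat context `ctx`
    would probably fail to map to an IRI (simplified approximation)."""
    has_vocab = ctx.get("@vocab") is not None
    dropped = set()
    for key in doc:
        if key.startswith("@"):
            continue  # JSON-LD keywords are not terms
        if key in ctx:
            mapped = ctx[key] is not None  # a term mapped to null is dropped
        else:
            # With @vocab, any plain term is mapped; a key containing ":"
            # is treated as a compact or absolute IRI.
            mapped = has_vocab or ":" in key
        if not mapped:
            dropped.add(key)
    return dropped


def check_strict(doc, ctx):
    """Approximate the proposed strict mode: raise instead of dropping."""
    dropped = unmapped_keys(doc, ctx)
    if dropped:
        raise UnmappedKeyError("unmapped keys: %s" % ", ".join(sorted(dropped)))


CTX = {"foo": "http://example.com/foo"}
print(unmapped_keys({"fooo": "bar"}, CTX))
```

Calling `check_strict` before expansion gives the "fail rather than drop" behaviour, while `unmapped_keys` alone is enough to print a warning and still process the expanded result.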

pchampin avatar Oct 14 '16 07:10 pchampin

@dlongley thanks for your comments on fe7eb47. First, let me just point out what I should have mentioned in the PR: this is just a quick proof of concept. I understand that the actual solution should be thought through more thoroughly.

Regarding "strict mode", I agree that processingMode is a good candidate for playing this role.

Regarding dropped keys, I had indeed overlooked the fact that, with multiple contexts, a key may be converted in some places and dropped in others, so my fix is obviously not satisfactory. Storing the complete path would seem a valid alternative, but looks harder to achieve in the current code.
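
To make the "complete path" idea concrete, here is a minimal sketch that walks a document and records the path to each key that the context in scope cannot map. It is independent of PyLD's internals: `dropped_paths` is a hypothetical helper, and it assumes contexts are inline dicts only (no remote contexts, no arrays of contexts, no expanded term definitions), so it only approximates real JSON-LD processing.

```python
def dropped_paths(node, ctx, path=()):
    """Yield the path (a tuple of keys and list indexes) to every key that
    the context currently in scope cannot map to an IRI."""
    if isinstance(node, list):
        for index, item in enumerate(node):
            yield from dropped_paths(item, ctx, path + (index,))
        return
    if not isinstance(node, dict):
        return
    local = node.get("@context")
    if isinstance(local, dict):
        ctx = {**ctx, **local}  # an embedded context extends the one in scope
    has_vocab = ctx.get("@vocab") is not None
    for key, value in node.items():
        if key.startswith("@"):
            if key != "@context":  # look inside @graph, @list, and so on
                yield from dropped_paths(value, ctx, path + (key,))
            continue
        if key in ctx:
            mapped = ctx[key] is not None
        else:
            mapped = has_vocab or ":" in key
        if mapped:
            yield from dropped_paths(value, ctx, path + (key,))
        else:
            yield path + (key,)  # the whole subtree is dropped with the key


CTX = {"foo": "http://example.com/foo"}
doc = {
    "foo": {
        "@context": {"baz": "http://example.com/baz"},
        "baz": 1,    # mapped here by the embedded context
        "fooo": 2,   # dropped
    },
    "baz": 3,        # dropped here: "baz" is not defined at the top level
}
print(list(dropped_paths(doc, CTX)))
```

In this example "baz" is mapped inside "foo" but dropped at the top level, which is the case a plain set of key names cannot express.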

I still believe that having a "strict" mode (under this name or another one) would be a benefit, even if we don't come up with a solution for reporting dropped keys in "normal" mode.

pchampin avatar Oct 17 '16 06:10 pchampin

I believe that I stumbled over this, or maybe @vocab can be another way of being "not strict". I expected this document and context...

    doc = {
        "@context": {
            "@base": "http://example.org/",
            "@vocab": "http://schema.org/",
            },
        "@id": "12345",
        "@type": "Blog",
        "title": "This is a blog title",
        }
    context = {
        }

to generate the same set of triples as this one...

    doc = {
        "@id": "12345",
        "@type": "Blog",
        "title": "This is a blog title",
        }
    context = {
        "@base": "http://example.org/",
        "@vocab": "http://schema.org/",
        }

but the latter does not include the title.

JoelBender avatar Nov 08 '16 15:11 JoelBender

@JoelBender I just tried your two examples, and I do get the same result...

    from pyld.jsonld import expand
    from pprint import pprint

    data_w_ctx = {
        "@context": {
            "@base": "http://example.org/",
            "@vocab": "http://schema.org/",
        },
        "@id": "12345",
        "@type": "Blog",
        "title": "This is a blog title",
    }

    pprint(expand(data_w_ctx, {"expandContext": {}}))

    data_wo_ctx = {
        "@id": "12345",
        "@type": "Blog",
        "title": "This is a blog title",
    }

    ctx = data_w_ctx['@context']

    pprint(expand(data_wo_ctx, {"expandContext": ctx}))

(with both the PyPI and GitHub versions of PyLD)

That being said, you are right: one way to implement "not strict" would be to append an artificial @vocab as a fallback context...
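
A rough sketch of that fallback idea, under some assumptions: the fallback IRI `http://example.invalid/unmapped#` and the helpers `with_fallback_vocab` and `fallback_iris` are made up for illustration, and a document that sets its own `@vocab` would override the fallback. Keys that end up under the fallback prefix after expansion are the ones that would otherwise have been dropped, so they can be reported or rejected afterwards.

```python
FALLBACK_VOCAB = "http://example.invalid/unmapped#"  # hypothetical marker IRI


def with_fallback_vocab(ctx, vocab=FALLBACK_VOCAB):
    """Return a copy of `ctx` that has a fallback @vocab unless it sets one."""
    merged = dict(ctx)
    merged.setdefault("@vocab", vocab)
    return merged


def fallback_iris(expanded, vocab=FALLBACK_VOCAB):
    """Collect every property IRI in expanded JSON-LD under the fallback vocab."""
    found = set()
    stack = [expanded]
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            stack.extend(node)
        elif isinstance(node, dict):
            for key, value in node.items():
                if key.startswith(vocab):
                    found.add(key)
                stack.append(value)
    return found


CTX = {"foo": "http://example.com/foo"}
print(with_fallback_vocab(CTX))

# With PyLD installed, the context would be used like this:
#   expanded = jsonld.expand({"fooo": "bar"},
#                            {"expandContext": with_fallback_vocab(CTX)})
# Below is a hand-written stand-in in the shape of expanded JSON-LD.
sample = [{FALLBACK_VOCAB + "fooo": [{"@value": "bar"}]}]
print(fallback_iris(sample))
```

The appeal of this approach is that nothing is lost during expansion: the caller decides afterwards whether fallback properties are an error, a warning, or acceptable.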

pchampin avatar Nov 08 '16 15:11 pchampin

@pchampin thank you, I misunderstood the options.

JoelBender avatar Nov 08 '16 16:11 JoelBender