pyld
It is not always relevant to silently ignore unrecognized properties
Consider the following code:
from pyld import jsonld

CTX = {"foo": "http://example.com/foo"}
expanded = jsonld.expand({"fooo": "bar"}, {'expandContext': CTX})
The result is an empty list, because "fooo" is not recognized by CTX. The fact that it is silently ignored can be relevant in some situations, but not in others. In this example, it is probably a typo, which the user would probably like to be detected.
So I think there should be a way to be notified when some JSON attributes were ignored.
The simplest (but not flexible) way to do it would be to have a strict option, defaulting to False. When set to True, an exception would be raised whenever a JSON key could not be converted to an IRI. In my application, I would rather fail than silently drop some data submitted by the user.
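To make the proposal concrete, here is a minimal user-side sketch of what a strict check might do. It does not use or modify PyLD: `check_strict` and `UnmappedKeyError` are hypothetical names, and the mapping rule is a deliberate simplification of JSON-LD IRI expansion (flat context only, top-level keys only).

```python
class UnmappedKeyError(ValueError):
    """Raised by this sketch when a key would be dropped (hypothetical name)."""


def check_strict(doc, ctx):
    """Raise if a top-level key of doc has no obvious mapping in ctx.

    Simplification: a key counts as mapped when it is a keyword ("@..."),
    a term defined in ctx, contains ":" (compact or absolute IRI), or when
    ctx provides an "@vocab".
    """
    if "@vocab" in ctx:
        return  # every plain key is resolved against the vocabulary
    unmapped = sorted(
        key for key in doc
        if not key.startswith("@") and key not in ctx and ":" not in key
    )
    if unmapped:
        raise UnmappedKeyError("keys not mapped to an IRI: " + ", ".join(unmapped))


CTX = {"foo": "http://example.com/foo"}
check_strict({"foo": "bar"}, CTX)  # passes silently

try:
    check_strict({"fooo": "bar"}, CTX)
except UnmappedKeyError as err:
    message = str(err)
```

A check like this would run before calling expand, so the typo is reported instead of being silently lost.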
A more flexible solution would be to have a droppedKeys option, expecting a set, and populating this set with every key that could not be converted to an IRI.
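from pyld import jsonld

CTX = {"foo": "http://example.com/foo"}
dropped = set()
# 'droppedKeys' is the option proposed above (hypothetical)
expanded = jsonld.expand({"fooo": "bar"}, {'expandContext': CTX, 'droppedKeys': dropped})
if dropped:
    # print some warning message to the user,
    # but still process expanded
    print("Warning: some keys were dropped:", dropped)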
@dlongley thanks for your comments on fe7eb47. First, let me just point out what I should have mentioned in the PR: this is just a quick proof of concept. I understand that the actual solution should be thought through more thoroughly.
Regarding "strict mode", I agree that processingMode is a good candidate for playing this role.
Regarding dropped keys, I have indeed overlooked the fact that, with multiple contexts, a key may be converted in some places and dropped in others, so my fix is obviously not satisfactory. Storing the complete path would seem a valid alternative, but looks harder to achieve in the current code.
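To illustrate what "storing the complete path" could look like, here is a rough sketch that works outside PyLD, on the input document itself. `is_mapped` and `find_dropped` are hypothetical helpers, and the rule they apply is a simplification: only inline dict contexts are merged (no remote or list contexts, no null term definitions).

```python
def is_mapped(key, ctx):
    """Simplified test: would this key survive expansion under ctx?"""
    return (
        key.startswith("@")   # keyword
        or key in ctx         # term defined in the active context
        or ":" in key         # compact or absolute IRI
        or "@vocab" in ctx    # plain keys resolve against the vocabulary
    )


def find_dropped(node, ctx=None, path=()):
    """Yield the path (tuple of keys and indexes) of each unmapped key."""
    ctx = dict(ctx or {})  # copy, so sibling nodes do not share changes
    if isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_dropped(item, ctx, path + (index,))
    elif isinstance(node, dict):
        local = node.get("@context")
        if isinstance(local, dict):
            ctx.update(local)  # a nested context only applies below this node
        for key, value in node.items():
            if key == "@context":
                continue
            if is_mapped(key, ctx):
                yield from find_dropped(value, ctx, path + (key,))
            else:
                yield path + (key,)  # the whole subtree goes with the key


doc = {
    "@context": {"foo": "http://example.com/foo"},
    "foo": {
        "@context": {"fooo": "http://example.com/fooo"},
        "fooo": "mapped here",
    },
    "fooo": "dropped here",
}
dropped_paths = list(find_dropped(doc))
```

Here the same key "fooo" is mapped under "foo" but unmapped at the top level, so only the top-level occurrence is reported. A plain set of key names could not express that distinction.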
I still believe that having a "strict" mode (under this name or another one) would be a benefit, even if we don't come up with a solution for reporting dropped keys in "normal" mode.
I believe that I stumbled over this, or maybe @vocab can be another way of "not strict". I expected this document and context...
doc = {
    "@context": {
        "@base": "http://example.org/",
        "@vocab": "http://schema.org/",
    },
    "@id": "12345",
    "@type": "Blog",
    "title": "This is a blog title",
}
context = {
}
to generate the same set of triples as this one...
doc = {
    "@id": "12345",
    "@type": "Blog",
    "title": "This is a blog title",
}
context = {
    "@base": "http://example.org/",
    "@vocab": "http://schema.org/",
}
but the latter does not include the title.
@JoelBender I just tried your two examples, and I do get the same result...
from pyld.jsonld import expand
from pprint import pprint

data_w_ctx = {
    "@context": {
        "@base": "http://example.org/",
        "@vocab": "http://schema.org/",
    },
    "@id": "12345",
    "@type": "Blog",
    "title": "This is a blog title",
}
pprint(expand(data_w_ctx, {"expandContext": {}}))

data_wo_ctx = {
    "@id": "12345",
    "@type": "Blog",
    "title": "This is a blog title",
}
ctx = data_w_ctx['@context']
pprint(expand(data_wo_ctx, {"expandContext": ctx}))
(with both the PyPI and GitHub versions of PyLD)
That being said, you are right: one way to implement "not strict" would be to append an artificial @vocab as a fallback context...
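As a rough illustration of that fallback idea: the sketch below adds a marker @vocab to a context, then scans an already expanded document for properties that landed in the marker namespace. `SENTINEL_VOCAB`, `with_fallback_vocab` and `sentinel_properties` are hypothetical names, and `expanded_sample` is written by hand to stand in for what expand would be expected to return with such a context.

```python
SENTINEL_VOCAB = "http://unmapped.invalid/"  # hypothetical marker namespace


def with_fallback_vocab(ctx, vocab=SENTINEL_VOCAB):
    """Return a copy of ctx with @vocab set, unless ctx already has one."""
    merged = dict(ctx)
    merged.setdefault("@vocab", vocab)
    return merged


def sentinel_properties(expanded, vocab=SENTINEL_VOCAB):
    """Collect property IRIs in expanded JSON-LD that start with vocab."""
    found = set()
    stack = [expanded]
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            stack.extend(node)
        elif isinstance(node, dict):
            for key, value in node.items():
                if key.startswith(vocab):
                    found.add(key)
                stack.append(value)  # scalars are ignored on the next pop
    return found


CTX = {"foo": "http://example.com/foo"}
fallback_ctx = with_fallback_vocab(CTX)

# Hand-written stand-in for the expansion of {"fooo": "bar"} under
# fallback_ctx (an assumption, not captured output).
expanded_sample = [{"http://unmapped.invalid/fooo": [{"@value": "bar"}]}]
unmapped = sentinel_properties(expanded_sample)
```

With this approach nothing is dropped, and the caller decides afterwards whether properties in the marker namespace are an error, a warning, or acceptable.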
@pchampin thank you, I misunderstood the options.