`@protected` creates unresolvable conflicts when the same term is defined in two contexts top-level
I've just encountered issue #424 (and the related #361 as well), and I'm in a similar situation: https://www.w3.org/ns/controller/v1 defines alsoKnownAs top-level alongside @protected: true, while https://www.w3.org/ns/activitystreams defines alsoKnownAs in a different namespace (as: vs sec:, loosely).
From controller/v1:
{
  "@context": {
    "@protected": true,
    "id": "@id",
    "type": "@type",
    "alsoKnownAs": {
      "@id": "https://w3id.org/security#alsoKnownAs",
      "@type": "@id",
      "@container": "@set"
    },
    // ...
From activitystreams:
{
  "@context": {
    "@vocab": "_:",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "as": "https://www.w3.org/ns/activitystreams#",
    // ...
    "alsoKnownAs": {
      "@id": "as:alsoKnownAs",
      "@type": "@id"
    }
    // ...
Putting activitystreams before controller/v1 causes the later definition to override the older one, as expected (but not as desired):
{
  "@context": ["https://www.w3.org/ns/activitystreams", "https://www.w3.org/ns/controller/v1"],
  "type": "Person",
  "id": "http://person.example",
  "alsoKnownAs": "https://person.example" // sec:alsoKnownAs
}
[
  {
    "https://w3id.org/security#alsoKnownAs": [ // should be https://www.w3.org/ns/activitystreams#alsoKnownAs
      {
        "@id": "https://person.example"
      }
    ],
    "@id": "http://person.example",
    "@type": [
      "https://www.w3.org/ns/activitystreams#Person"
    ]
  }
]
But putting activitystreams after controller/v1 triggers the error due to @protected: true:
{
  "@context": [
    "https://www.w3.org/ns/controller/v1", // uses @protected
    "https://www.w3.org/ns/activitystreams" // will trigger the redefinition error
  ],
  "type": "Person",
  "id": "http://person.example",
  "alsoKnownAs": "https://person.example"
}
jsonld.SyntaxError: Invalid JSON-LD syntax; tried to redefine a protected term.
JSON-LD 1.1 4.1.11 Protected term definitions https://www.w3.org/TR/json-ld11/#protected-term-definitions describes two exceptions. The first exception is when the definition is the same, which is not applicable here. The second exception is for property-scoped context definitions, which is unworkable here because the single top-level object is intended to be both an Actor and a Controller Document.
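For contrast, the second exception does work where it applies: a property-scoped context is allowed to override a protected term, but only beneath the scoping property. A minimal sketch (the profile property here is hypothetical, not from either context):

{
  "@context": [
    "https://www.w3.org/ns/controller/v1",
    {
      "profile": {
        "@id": "https://example.org/vocab#profile",
        "@context": {
          "alsoKnownAs": {
            "@id": "https://www.w3.org/ns/activitystreams#alsoKnownAs",
            "@type": "@id"
          }
        }
      }
    }
  ],
  "id": "http://person.example",
  "profile": {
    "alsoKnownAs": "https://person.example" // as:alsoKnownAs, no redefinition error here
  }
}

That doesn't help here, because alsoKnownAs needs to be usable on the top-level object itself.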
To verify, here's a type-scoped context definition that errors out:
{
  "@context": [
    "https://www.w3.org/ns/controller/v1",
    {
      "Person": {
        "@id": "https://www.w3.org/ns/activitystreams#Person",
        "@context": {
          "alsoKnownAs": { // triggers the redefinition error
            "@id": "https://www.w3.org/ns/activitystreams#alsoKnownAs"
          }
        }
      }
    }
  ],
  "type": "Person",
  "id": "http://person.example",
  "alsoKnownAs": "https://person.example"
}
And to reiterate, a property-scoped context definition can't be used because the alsoKnownAs property is top-level. So the way I see it, there's nothing that can be done to resolve this in a "plain JSON" compatible way except:
- a) convince whoever is responsible for controller/v1 to remove @protected: true
- b) convince whoever is responsible for controller/v1 to redefine alsoKnownAs with the activitystreams-namespaced @id instead of the security-namespaced one (see the sketch below)
- c) write my own context document
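For option (b), a sketch of what the corrected controller/v1 definition might look like (assuming the rest of the term definition stays as it is today):

"alsoKnownAs": {
  "@id": "https://www.w3.org/ns/activitystreams#alsoKnownAs",
  "@type": "@id",
  "@container": "@set"
}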
This leads me to think that @protected is a generally poorly-thought-out mechanism that greatly increases the likelihood of such conflicts. Without it, as a producer I could just redefine the term later, for example by putting the activitystreams context last, or by using a local context object that comes after both remote contexts:
{
  "@context": [
    "https://www.w3.org/ns/controller/v1", // needs to remove @protected
    "https://www.w3.org/ns/activitystreams" // as:alsoKnownAs will override controller/v1's sec:alsoKnownAs
  ],
  "type": "Person",
  "id": "http://person.example",
  "alsoKnownAs": "https://person.example" // as:alsoKnownAs
}
or
{
  "@context": [
    "https://www.w3.org/ns/activitystreams", // defines as:alsoKnownAs
    "https://www.w3.org/ns/controller/v1", // redefines sec:alsoKnownAs as @protected
    {
      "alsoKnownAs": {
        "@id": "https://www.w3.org/ns/activitystreams#alsoKnownAs", // won't work unless controller/v1 removes @protected
        "@type": "@id"
      }
    }
  ],
  "type": "Person",
  "id": "http://person.example",
  "alsoKnownAs": "https://person.example" // as:alsoKnownAs
}
I'm not sure the existence of @protected accomplishes its stated goal of "prevent[ing] this divergence of interpretation", nor that the rationale that "plain JSON" implementations, relying on a given specification, will only traverse properties defined by that specification sufficiently addresses the issue of conflicts (or that it is a valid assumption in the first place). The issue arises when two specifications define the same term, and both specifications apply to the current object or document. It effectively leads to a hard incompatibility where it is impossible to implement both specs fully; you have to pick between them.
If there's an option I'm not aware of I'd like to hear it.
There's a typo in the controller document v1 context and it should instead use the activitystreams vocab for alsoKnownAs. A bug fix will address this particular case.
That being said, the whole point of protection is to enforce a particular term definition in a particular place when a particular context is present. So it is not a bug that it is doing this, but a feature. And it does require coordination to share terms across contexts in the same place (by ensuring the term definitions match). That's a requirement for the feature to work. You can only use other term definitions when you bring in a property-scoped context (as mentioned), because decentralized extensibility (in this case, reuse of the same term with a different definition) is only considered safe in different areas of the JSON tree in the same document.
Of course, if specs and / or implementations allow for JSON-LD compaction to be performed, then significantly more flexibility is possible. All of these designs are around finding a balance for different kinds of consumers in a sufficiently large decentralized ecosystem, some who will only accept static documents and others who might use compaction prior to consumption. This of course creates constraints.
the whole point of protection is to enforce a particular term definition in a particular place when a particular context is present. So it is not a bug that it is doing this, but a feature. And it does require coordination to share terms across contexts in the same place (by ensuring the term definitions match). That's a requirement for the feature to work.
If I'm reading this correctly, are you saying that two context authors are required to coordinate whenever there is a term conflict? This seems unworkable given the open-world assumption. If any single context author decides to make their term definition(s) @protected, then this creates problems for anyone else who defines the term differently. Essentially, one author doing it means that this author gets supremacy over the "plain JSON" and that their context declaration needs to come last or else the JSON-LD parser will throw a redefinition error. Two authors doing it will create an unresolvable error.
It seems to me like this unnecessarily makes things way more complicated for polyglots or anyone wanting to implement multiple overlapping specs. If for example schema.org decided to protect their context, it would become impossible to use both activitystreams and schema.org, because numerous top-level properties like name are shared across both contexts. A developer producing documents with "@context": ["https://schema.org", "https://www.w3.org/ns/activitystreams"] in this example would be creating irreconcilably unprocessable JSON-LD documents, because as:name is seen as a redefinition of schema:name. This means that either the developer will be forced to write their own context document (even if they don't understand JSON-LD), or that some downstream consumer will have to postprocess the unprocessable JSON-LD to replace the context with their own corrected one.
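A sketch of that hypothetical failure mode (to be clear, schema.org does not currently protect its context; this assumes it did):

{
  "@context": [
    "https://schema.org", // hypothetically @protected
    "https://www.w3.org/ns/activitystreams" // as:name would be seen as a redefinition of schema:name
  ],
  "type": "Person",
  "name": "A. Developer" // never reached: the whole document is unprocessable
}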
I don't see a situation that can possibly work smoothly so long as anyone uses @protected. If the aim is to ensure that terms don't get redefined, then this feels like a backfire because the actual result is that the entire document becomes unprocessable; instead of not understanding some number of redefined terms and having them appear to be missing ("I can't find schema:name, I only have as:name, but all the other schema: properties are as expected"), you end up not understanding the entire document ("my parser is giving me an error, I can't do anything with this unless I replace their context with what I am guessing they meant").
@trwnh,
Apologies, I would have written a shorter response if I had more time.
If I'm reading this correctly, are you saying that two context authors are required to coordinate whenever there is a term conflict?
No, I'm saying that the @protected feature was created for use by specifications that do require significant coordination to decide what the immutable definitions for certain terms in certain documents ought to be. This coordination may be done over the period of several years, in a standards working group. The @protected feature is to explicitly prohibit different definitions for the same terms in the same places in JSON documents. There is no way for two (or more) different context authors to coordinate to sort out a term conflict here, because using a definition different from what is written in the spec is prohibited. The coordination must happen prior to the spec becoming a standard.
This prohibition exists for a good reason: to enable both rigid and flexible implementations to interoperate.
It is used when there is a spec that expresses, in detail, a data model and JSON format, such that implementers who read the spec can write rigid implementations "in the context of" the data as expressed in the specification. In other words, from this perspective, these specs are no different from any other specification designed around information expressed in JSON (with no capability to transform conforming documents into some other expression).
These rigid implementations treat the URLs in the @context field as simple document type + version identifiers. No JSON-LD library or API calls are needed to work with conforming documents, as conformance requires that these fields match specific values and that the documents have an expected structure.
However, behind these @context values are actual JSON-LD context documents that are processable by more flexible implementations. These flexible implementations are able to use JSON-LD libraries to understand the data (potentially even without the spec, through "follow your nose") or to transform the data into a different expression that their code is expecting. By using the @protected keyword in these contexts, an enforcement process is introduced by which the same interpretation is guaranteed to be used across these different implementation approaches (or a protected term error will be thrown).
Of course, enabling these two approaches at once has trade-offs. Nothing is for free. Coordination is required while creating the spec and, as is always required when using a JSON spec, a conforming document must not deviate from what's in the spec or reuse terms (JSON keys) to mean something other than what is in the spec. Simply put: the use of a spec and the @protected feature to increase interoperability across implementations of differing complexity reduces some decentralized extensibility in exchange for allowing less complex (but interoperable) consumers.
This seems unworkable given the open-world assumption.
It's workable, and only slightly more constrained, i.e., you can't "just use whatever term definitions you want" in your documents and expect them to be consumable by simpler implementations that are unable to understand your changes. The most basic and commonly reused term definitions from a spec are immutable.
If it helps, this can be thought of as extending the set of JSON keys that JSON-LD already doesn't allow redefinition of, i.e., all keywords (e.g., @context, @id, @type). I don't think this constraint makes JSON-LD "unworkable given the open-world assumption", as you say. By using the @protected feature, a context author just reduces the set of immutable JSON keys a little further beyond what JSON-LD already restricts in its own spec.
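As a minimal illustration of that pre-existing restriction, a context that tries to turn a keyword into a term is already rejected by every conforming processor:

{
  "@context": {
    "@id": "https://example.org/my-id" // keyword redefinition error
  },
  "@id": "http://person.example"
}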
Specs that use this feature require the more complex implementations to express their documents in a more rigid way (really, in a specific context) in order to enable simpler implementations to exist. However, you can, of course, express all the information you want using other terms that the spec doesn't mark as @protected. The more complex implementations can then transform incoming documents into whatever contexts they want to (using whatever terms they want to) for consumption.
It is true that when a spec uses this feature it might become incompatible with another spec that also tries to enable these two types of implementations: you can't have a single document be expressed using two contexts that are in conflict with one another. Note that the Activity Streams work tried to enable simpler consumers too, it just didn't use the @protected feature (IIRC, it wasn't available at the time). A consequence of this is that anyone can change the definition of a term defined by the Activity Streams context (by using the @context field), but the simpler implementations do not detect it. This creates semantic confusion which can lead to a variety of serious problems. Newer specifications can avoid this by using @protected in their contexts to actually surface these errors -- so that no valid implementation can use such a document (as you say, the document becomes "unprocessable").
This means that either the developer will be forced to write their own context document (even if they don't understand JSON-LD), or that some downstream consumer will have to postprocess the unprocessable JSON-LD to replace the context with their own corrected one.
...
If the aim is to ensure that terms don't get redefined, then this feels like a backfire because the actual result is that the entire document becomes unprocessable; instead of not understanding some number of redefined terms and having them appear to be missing ("I can't find schema:name, I only have as:name, but all the other schema: properties are as expected")
Your concerns are certainly heard -- but it's important to remember that one of the constraints is that the simplest implementations do not use a JSON-LD library at all. To enable these implementations, document authors have to work within the constraints in the specification: you can't change certain term definitions in certain places. Simply allowing any definition to be used without throwing any errors won't solve this problem, it will just create semantic confusion. As always, I (and many others) am all ears for a better solution to this problem (given the constraints), but allowing semantic confusion to happen isn't an acceptable outcome -- so this is the best solution we've landed on for now.
@trwnh wrote:
b) convince whoever is responsible for controller/v1 to redefine alsoKnownAs with the activitystreams-namespaced @id instead of the security-namespaced one
Hi, that's me ("whoever is responsible for controller/v1") :)
It's a bug, thanks for catching it; that context is fairly new and hasn't been put through its paces yet.
Feel free to raise a PR on controller/v1 to fix the issue, or I will do it when I get around to addressing the issue you raised in that repository.
Your concerns are certainly heard -- but it's important to remember that one of the constraints is that the simplest implementations do not use a JSON-LD library at all. To enable these implementations, document authors have to work within the constraints in the specification: you can't change certain term definitions in certain places. Simply allowing any definition to be used without throwing any errors won't solve this problem, it will just create semantic confusion.
This is part of my concern, though: a document producer who does not use JSON-LD, but declares two well-known remote context documents, because the specs tell them to, or because they think that's what they need to do.
What this producer has just done is completely invisible to "plain JSON" consumers (who aren't aware of any term definitions let alone the possibility of redefining one or that this might conflict). But even the most basic of JSON-LD processors now has to deal with the mess that was created by this incompatible context declaration. I'm not entirely convinced of the fail-fast-and-hard approach here; maybe the JSON-LD processing algorithm could use an additional flag that converts these errors to warnings? This would allow the processor to at least have something to process, provided that they are willing to accept the semantic confusion. (Any errors in schema would be caught "further down the chain", so the document may be discarded later if it is unsuitable for further processing.)
Essentially, the use of @protected in any context document needs to come with a prominent disclaimer that it severely limits compatibility. "Be careful, this can prevent adaptation" doesn't make the consequences fully clear. There should probably be language added about using multiple context documents, and how the use of @protected in any one of them can create problems depending on the order in which you declare those contexts, or on whether any of the others likewise declare @protected. It should be clearly called out that "warning, the JSON-LD document may become unprocessable" is even a possibility, so that context publishers can carefully consider this possible consequence before just slapping a @protected in there.
maybe the JSON-LD processing algorithm could use an additional flag that converts these errors to warnings?
This has been discussed before and identified as a really bad idea. What you describe is in the class of errors that can lead to security compromises. If you want to ignore these sorts of security compromises, don't use @protected, but if you don't use @protected, don't expect people to depend on your context in situations where security is important.
When this sort of thing happens (overriding errors of a detected term conflict), it is definitely a problem that must not be ignored. Doing so would be like a static analysis tool for a non-memory range checked language finding out that you're using memory after it has been freed and allowing the practice to continue happening -- it's a recipe for something really bad happening to the code in production.
Okay, if you say @protected should be used to help avoid security compromises, then when should one not use @protected? It still feels like the feature is being overused, and most of the cases of conflicting terms I've encountered appear to be primarily in the class of semantic errors where two term definitions differ in @id but could be taken to represent the same concept (owl:equivalentClass or owl:equivalentProperty). Things like as:name vs schema:name, or as:mediaType vs schema:encodingFormat. If there is any difference between the two terms, it is in the spec processing level, and in what those terms imply for the processing of other terms; for example, as:mediaType has implications for as:content or as:href, whereas schema:encodingFormat might have implications for schema:contentUrl or more generally for a schema:CreativeWork.
To be clear, I think this kind of thing (where a certain interpretation is required) somewhat strongly indicates that perhaps application/ld+json is no longer sufficient as a content type for that document, and a dedicated media format with its own processing semantics might be required (like application/activity+json or application/vc). Somewhat unfortunately, it looks like going this route also implicitly locks down extensibility, with one context document being given supremacy over any others. This is probably fine for documents of that content type that only use that context, or might augment it with a few additional term definitions... but an interpretation where a document may wish to conform to multiple types is not possible.
In light of that, perhaps the use of @protected should be advised (or reserved?) only in cases where you are no longer doing (or you expect your consumers to no longer be doing) "generic JSON-LD". Maybe some language along the lines of "if you use @protected, consider defining your own media type separately from application/ld+json, because the use of @protected significantly constrains the semantics and processability of the document" -- this leaves the problem of "conforming to multiple types" unsolved (and that might be a larger problem that JSON-LD itself cannot solve on its own), but at least it sets the expectations correctly.
This was discussed during the json-ld meeting on 13 November 2024.
View the transcript
Issue Discussion
bigbluehat: We're working through the project list.
gkellogg: added issues that are class 1-3.
subtopic w3c/json-ld-syntax#436
<gb> Issue 436 URI in Profile triggers CORS Unsafe Request Header Byte rule (by azaroth42) [spec:w3c] [needs discussion] [tag-needs-resolution]
gkellogg: might just create "tokens" for profile parameters.
gkellogg: tokens not being namespaced is mitigated by the fact that the media-type is the namespace.
bigbluehat: So, it treats the media-type as the namespace.
… Profile parameters not having a colon is wide-reaching
gkellogg: not sure how we update guidance for using profile parameters.
bigbluehat: This would be a breaking change for web annotations.
… That would mean web annotations needs their own media type.
niklasl: dlehn's reply may mean this isn't as horrible as it seems.
… I think the datasets working group has done something with this.
pchampin: This doesn't seem to be a problem where things can't work, but making them work is tricky, due to pre-flight requests.
… If we expect a server to support profile-based content-negotiation, it doesn't come automatically.
… If you want to support this, you'll also need to support pre-flight requests.
pchampin: This is difficult to configure and easily forgotten.
<gb> Issue 436 URI in Profile triggers CORS Unsafe Request Header Byte rule (by azaroth42) [spec:w3c] [needs discussion] [tag-needs-resolution]
bigbluehat: There were some suggestions for defining enumerated values (tokens).
<pchampin> I think it wouldn't hurt to define "short names" for the profiles in addition to the currently defined IRIs
bigbluehat: The key is to not make it a breaking change.
… This would affect the media-type registration.
niklasl: Aren't link headers defined similarly, where there are pre-defined tokens and IRIs may also be used.
bigbluehat: Browsers have made decisions which are affecting what we can do.
<bigbluehat> > When processing the "profile" media type parameter, it is important to note that its value contains one or more URIs and not IRIs. In some cases it might therefore be necessary to convert between IRIs and URIs as specified in section 3 Relationship between IRIs and URIs of [RFC3987].
https://
<niklasl> application/ld+json;profile="http://
niklasl: I think it would be good to add tokens. Rob's specific problem is more about the other uses of profiles.
… I wonder if our solution would be considered a solution for the issue; maybe parts of the issue can't be solved in the JSON-LD spec. Might recommend IIIF to use profile negotiation.
… But, using pre-flight does work, so that would be on their end.
… It's more that we put forward the design pattern and it has become more tricky.
bigbluehat: The ramifications of this are not just expand/compact/... Rob's point is for other specifications that used the same pattern.
… Now we know to avoid it.
<niklasl> See also: https://
bigbluehat: There's reason to document this in the best-practices document. How this affects other specs would mean that they cannot treat profile as being extensible, and will need a new media type.
gkellogg: we might create a registry to allow other specifications to add their profile parameters without needing a new media-type.
bigbluehat: niklasl shared a document on using the profile parameter for content negotiation.
pchampin: Reaching out to the TAG would be a good idea, as other specs rely on this, and they would be impacted.
… I'd like to see their thoughts and how much we should make the effort to try to change this.
… Regarding the spec, note that this is a working draft which has been inactive for a while. This might not be the strongest argument to take before the TAG. (The dataset exchange WG)
… Part of the reason that spec is stalled is that there are contentious discussions with IETF on where it belongs.
<niklasl> From the dx-prof-conneg draft: During 2018, DXWG members had a longer discussion with the JSON-LD WG at the annual forum TPAC in Lyon, France and it was concluded that the "profile” parameter in the Accept and Content-Type headers should be seen to convey profiles that are specific to the Media Type [such as JSON-LD's expanded .... ]
pchampin: But, is there enough interest in IETF to continue the work?
niklasl: There are aspects of the draft that go into whether the profile parameter of the media type is the right way to go.
… The design of IIIF and Activity Streams I appreciate more when not looking at it from an RDF perspective.
… These are more useful at the intersection of JSON and RDF, which makes it easier to create specifications in a distributed way.
… If I believed (from RDF perspective) that format is irrelevant, general content negotiation works well.
… I can see how the TAG might argue from one of these perspectives. Maybe we shouldn't invent media-types on the fly.
<pchampin> https://
pchampin: Regarding the value of using JSON-LD media-type with parameter vs a new media-type, VC has had to rely on this for a while.
… The current solution is to have a dedicated media-type with additional language to explain the relationship between the two media types.
… We might point other specs to that solution.
<niklasl> +1 to mentioning that "third" point of view (very pertinent IMHO)
bigbluehat: I think we need to move on and come back to this issue.
… It would be great to write some of these things up on the issue so that we have something coherent to bring to the TAG.
… IETF has shifted their approach, and we're stuck in the middle. In the meantime, if we can collect thoughts in the issue.
… I don't think we know enough to lay out the preferred solution.
… If we go the short-name route, we run the risk of turning into a registry.
<bigbluehat> w3c/
<gb> Issue 443 `@protected` creates unresolvable conflicts when the same term is defined in two contexts top-level (by trwnh) [spec:editorial] [wr:commenter-agreed-partial] [class-2]
This was discussed during the json-ld meeting on 13 November 2024.
View the transcript
w3c/json-ld-syntax#443
bigbluehat: This dove-tails with the profile-parameter conversation for other communities
… If a media type expects a context to exist, they would inject one if not provided.
… We could make other discussion issues from comments in this issue.
niklasl: IIRC, Activity Streams says you should put their context last because of this issue.
… If you use short names that have meaning, you must lock them down.
dlehn: I need to re-review the issue.
… In the case of the controller, it would be to change the activity streams URL, but that's kind of strange. People expect terms to be gathered in one place.
<niklasl> Maybe what is asked for is how to use this design pattern to have partial extensibility, extensions which are always subordinate to the "hardcoded" context (that may evolve)?
dlehn: This would conflict with other things where JWT is also used.
pchampin: The comment at the end is interesting as it resonates with TPAC discussions.
… There are two types of JSON-LD, one which is more about the RDF semantics, the other is about general representation of knowledge.
… I sympathize that we should make this more clear, but don't think it's a bug in the spec.
bigbluehat: There's a tension between generic JSON-LD which is endlessly pluggable, which confuses people.
… In this view, JSON-LD isn't the end product, but adding in @protected you constrain it into a use case, as you are using very specific terminology and limiting the extension points.
… At TPAC there was a discussion about other things, such as schema.org, or are we going to specific content formats with self-defined semantics.
… Maybe this is not a syntax change, but a best practices note. If you're in ld+json land you can do what you want, but if you're in something that provides more constraints, you may need different solutions.
<niklasl> +1 for best practice
<anatoly-scherbakov> +1
<gkellogg> +1
dlehn: It seems to be a bit more than best-practices as you need to tell people how to get around the rules.
dlehn: It's nice when things live together.
bigbluehat: In the future, maybe there would be a way to link from the spec to BP.
<bigbluehat> PROPOSAL: Address the concerns around when to use `@protected` (which were raised in https://
<gb> Issue 443 `@protected` creates unresolvable conflicts when the same term is defined in two contexts top-level (by trwnh) [spec:editorial] [wr:commenter-agreed-partial] [class-2]
<bigbluehat> +1
<niklasl> +1
<pchampin> +1
<gkellogg> +1
<anatoly-scherbakov> +1
<TallTed> +1
<dlehn> +1
dlehn: Is it more "when" or "how" to use @protected?
RESOLUTION: Address the concerns around when to use `@protected` (which were raised in https://
bigbluehat: We can make it as "best practice" and notify the commenter.
<niklasl> ... and *why* to...
bigbluehat: @protected needs more content.
<dlehn> "... when, how, and why to use ..."
RESOLUTION: Address the concerns around when to use `@protected` (which were raised in https://
This was discussed during the #json-ld meeting on 26 February 2025.
View the transcript
w3c/json-ld-syntax#443
<gb> Issue 443 `@protected` creates unresolvable conflicts when the same term is defined in two contexts top-level (by trwnh) [spec:editorial] [wr:commenter-agreed-partial] [class-2]
pchampin to comment and close this issue.
pchampin: I'll comment, but I don't want to be too dismissive.
… There may be a solution to their problem; I'll investigate. I don't think there's a need for a spec change.
niklasl: Sounds like it should go in Best Practices.
… Generally, it might be clearer to explain that if you use someone else's context you should create your own.
<dlehn> a related issue with someone bumping into protected redefinitions: digitalbazaar/
<gb> Issue 563 Lint complains of redefinition when re-using term with context (by absoludity)
Dear @trwnh , as you can see above, the WG considers that this issue should be addressed in the Best Practices document, but does not call for a change in the spec.
In light of that, perhaps the use of @protected should be advised (or reserved?) only in cases where you are no longer doing (or you expect your consumers to no longer be doing) "generic JSON-LD". Maybe some language along the lines of "if you use @protected, consider defining your own media type separately from application/ld+json, because the use of @protected significantly constrains the semantics and processability of the document" -- this leaves the problem of "conforming to multiple types" unsolved (and that might be a larger problem that JSON-LD itself cannot solve on its own), but at least it sets the expectations correctly.
I'm very much aligned with your view. To quote @ericprud in a recent conversation: "JSON-LD can be used as a way to interpret JSON-based formats as RDF, and it can be used as a generic syntax for encoding RDF in JSON, but it can't do both at the same time." @protected is clearly meant for the first category of use-cases (so that the interpretation of RDF users does not diverge from that of JSON users).
And yes, for the 1st category of use-cases, specific media-types are probably the right way to go.
As you can see, the Best Practices document has not received much love recently, and the WG is not exactly overflowing with active participants right now... It would be awesome if you could make a PR to this document that basically includes the ideas you developed above.
This was discussed during the #json-ld meeting on 09 April 2025.
View the transcript
w3c/json-ld-syntax#443
<gb> Issue 443 `@protected` creates unresolvable conflicts when the same term is defined in two contexts top-level (by trwnh) [spec:editorial] [propose closing] [wr:commenter-agreed-partial] [class-2]
gkellogg: this came in through some confusion and discussion
… and any remaining concerns have been moved to another issue
… anyone object to closing this one?
We'll close the issue pending further input.
Further input: I don't have much to really add on, just want to distill what has been said so far and make sure that any other spun-off issues/PRs/etc are linked from here.
This information should probably make it into the Best Practices document, and possibly a brief non-normative callout note right next to the definition of @protected...
"JSON-LD can be used as a way to interpret JSON-based formats as RDF, and it can be used as a generic syntax for encoding RDF in JSON, but it can't do both at the same time."
@protected is clearly meant for the first category of use-cases (so that the interpretation of RDF users does not diverge from that of JSON users).
for the 1st category of use-cases, specific media-types are probably the right way to go.
Somewhat unfortunately, it looks like going this route also implicitly locks down extensibility, with one context document being given supremacy over any others.
Maybe some language along the lines of "if you use @protected, consider defining your own media type separately from application/ld+json, because the use of @protected significantly constrains the semantics and processability of the document" -- this leaves the problem of "conforming to multiple types" unsolved (and that might be a larger problem that JSON-LD itself cannot solve on its own), but at least it sets the expectations correctly.
Additionally, I saw some language regarding "media type precision" in Verifiable Credentials work: https://www.w3.org/TR/vc-data-model-2.0/#media-type-precision which might also make sense to integrate into JSON-LD Best Practices language.
On the following language from https://w3c.github.io/json-ld-syntax/#protected-term-definitions as well: would a PR for this change make sense?
Note: By preventing terms from being overridden, protection also prevents any adaptation of a term (e.g., defining a more precise datatype, restricting the term's use to lists, etc.). This kind of adaptation is frequent with some general purpose contexts, for which protection would therefore hinder their usability. As a consequence, context publishers should use this feature with care. In particular, using a protected context in a JSON-LD document significantly constrains the semantics and processing of that document beyond the constraints of generic JSON-LD. Specifications that make use of protected contexts should therefore consider defining a dedicated media type to which the protected context applies.
In particular, using a protected context in a JSON-LD document significantly constrains the semantics and processing of that document beyond the constraints of generic JSON-LD.
I don't think this statement is true. What is true is that you can't use a context with protection rules unless you comply with them. It doesn't mean that a document carrying such a context can't be processed by "generic JSON-LD consumers". Which is really the whole point: such a context allows for processing by both "specific/specialized JSON-LD consumers" and "generic JSON-LD consumers". A context without such protections can't be used in that way.
Specifications that make use of protected contexts should therefore consider defining a dedicated media type to which the protected context applies.
A dedicated media type can be of use when communicating with the specific/specialized JSON-LD consumers, but not with the generic ones. So there is more complexity here than I think is captured. I don't have an alternative suggestion. Another minor note is that the word "should" is RFC language, so that would need to be replaced as well.
What is true is that you can't use a context with protection rules unless you comply with them
So then I guess the phrasing needs to be centered around the use of multiple contexts, where at least one is protected and at least one conflicts with a protected term, which becomes more likely as more terms are protected/reserved.
Maybe something like:
Note: By preventing terms from being overridden, protection also prevents any adaptation of a term (e.g., defining a more precise datatype, restricting the term's use to lists, etc.). This kind of adaptation is frequent with some general purpose contexts, for which protection would therefore hinder their usability. In particular, using multiple independent contexts can more easily lead to conflicts depending on the order in which protected contexts are declared and on which terms are protected. Such conflicts might be unresolvable, and in those cases, it becomes impossible to use those contexts together in the same document. Some specifications or media types require or assume certain normative protected contexts; in these cases, conformance to multiple normative contexts depends on resolving any conflicts. As a consequence, context publishers should use this feature with care.
In practical terms, there is also the slightly modified problem of dealing with multiple resources being included in the same graph:
{
  "@context": {
    "conflictingTerm": {
      "@id": "https://foo.example",
      "@protected": true
    }
  },
  "conflictingTerm": false,
  "https://generic-property.example": {
    "@context": {
      "conflictingTerm": "https://bar.example"
    },
    "conflictingTerm": "yes"
  }
}
In the above example, a property-scoped context for https://generic-property.example will not work because the property is unknown ahead of time. I have tried using @propagate: false to work around this:
{
  "@context": {
    "@propagate": false,
    "conflictingTerm": {
      "@id": "https://foo.example",
      "@protected": true
    }
  },
  "conflictingTerm": false,
  "https://generic-property.example": {
    "@context": {
      "conflictingTerm": "https://bar.example"
    },
    "conflictingTerm": "yes"
  }
}
This seems to work, but it also means that I have to repeat protected definitions nested within properties that are expected (to get around the lack of propagation at the top-level):
{
  "@context": {
    "@propagate": false,
    "conflictingTerm": {
      "@id": "https://foo.example",
      "@protected": true
    },
    "specificProperty": {
      "@id": "https://specific-property.example",
      "@type": "@id",
      "@protected": true,
      "@context": {
        "conflictingTerm": {
          "@id": "https://foo.example",
          "@protected": true
        }
      }
    }
  },
  "conflictingTerm": false,
  "https://generic-property.example": {
    "@context": {
      "conflictingTerm": "https://bar.example"
    },
    "conflictingTerm": "not anymore"
  },
  "specificProperty": {
    "conflictingTerm": false
  }
}
This again triggers the conflict, but this time it's expected:
{
  "@context": {
    "@propagate": false,
    "conflictingTerm": {
      "@id": "https://foo.example",
      "@protected": true
    },
    "specificProperty": {
      "@id": "https://specific-property.example",
      "@type": "@id",
      "@protected": true,
      "@context": {
        "conflictingTerm": {
          "@id": "https://foo.example",
          "@protected": true
        }
      }
    }
  },
  "conflictingTerm": false,
  "https://generic-property.example": {
    "@context": {
      "conflictingTerm": "https://bar.example"
    },
    "conflictingTerm": "not anymore"
  },
  "specificProperty": {
    "@context": {
      "conflictingTerm": "https://bar.example"
    },
    "conflictingTerm": "it is again"
  }
}
So I guess one technique that context publishers can use is to disable context propagation, then selectively redefine all of their own protected terms inside a property-scoped context wherever they expect nesting to remain "within the context"?
So maybe something like:
Note: By preventing terms from being overridden, protection also prevents any adaptation of a term (e.g., defining a more precise datatype, restricting the term's use to lists, etc.). This kind of adaptation is frequent with some general purpose contexts, for which protection would therefore hinder their usability. In particular, using multiple independent contexts can more easily lead to conflicts depending on the order in which protected contexts are declared and on which terms are protected. Such conflicts might be unresolvable, and in those cases, it becomes impossible to use those contexts together in the same document. Some specifications or media types require or assume certain normative protected contexts; in these cases, conformance to multiple normative contexts depends on resolving any conflicts. As a consequence, context publishers should use this feature with care. One technique for minimizing conflicts is to disable [context propagation] at the top level, then redefine protected terms within a [property-scoped context].
I'm not sure what to do with that last sentence, because it feels like it could be expanded into more fleshed-out guidance, possibly as an additional example...
Simpler advice might be to not mix unexpected inline contexts with protected contexts. You don't have to do anything with @propagate if you use another non-inline context to define https://generic-property.example (where you can nullify the context, etc.) or if you set aside a specific property in your document that nullifies the context like is done in the VC 2.0 context for the verifiableCredential property of a VerifiablePresentation.
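A minimal sketch of that second technique (hypothetical term names; the actual VC 2.0 context does something similar for verifiableCredential): a property-scoped "@context": null clears the active context, protected terms included, beneath that property:

{
  "@context": {
    "@protected": true,
    "conflictingTerm": "https://foo.example",
    "attachment": {
      "@id": "https://attachment-property.example",
      "@context": null // clears the active context, including protected terms
    }
  },
  "conflictingTerm": false,
  "attachment": {
    "@context": {
      "conflictingTerm": "https://bar.example" // no longer a redefinition
    },
    "conflictingTerm": "yes"
  }
}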
One goal of @protected is to enable specific/specialized consumers to be able to check the top-level context for well-known values without having to do any additional generalized processing; so it is not intended to be compatible with unexpected, later appearing inline contexts. "Commitments" are essentially made not to do things those consumers wouldn't expect / couldn't handle without taking on more generalized consumption burdens.
not mix unexpected inline contexts with protected contexts
This works for a single document but not so well when you start mixing multiple documents (as in a Linked Data setting):
{
  "@id": "https://thing1.example/",
  "@context": "https://well-known-protected-context.example",
  "termFromWellKnownProtectedContext": true,
  "https://generic-property.example": {"@id": "https://thing2.example/"}
}
{
  "@id": "https://thing2.example/",
  "@context": "https://unprotected-context.example",
  "termFromWellKnownProtectedContext": "this only conflicts when embedded"
}
A common thing that naive (possibly LD-unaware) processors will do is hydrate a document to include expanded references where possible, as a form of convenience:
{
  "@id": "https://thing1.example/",
  "@context": "https://well-known-protected-context.example",
  "termFromWellKnownProtectedContext": true,
  "https://generic-property.example": {
    "@id": "https://thing2.example/",
    "@context": "https://unprotected-context.example",
    "termFromWellKnownProtectedContext": "this only conflicts when embedded"
  }
}
Here, the previously non-conflicting term termFromWellKnownProtectedContext suddenly becomes conflicting, because the reference to https://thing2.example/ was expanded to embed the latter document into the former document.
The use of @propagate, I've found in the playground, prevented the conflict from occurring. Are you saying that the "simpler advice" is for these LD-unaware processors to mutate the documents such that the inline context becomes top-level? How would that even work for IRI property keys?
{
  "@id": "https://thing1.example/",
  "@context": {
    "termFromWellKnownProtectedContext": {
      "@id": "https://foo.example/",
      "@protected": true
    }
  },
  "termFromWellKnownProtectedContext": true,
  "https://generic-property.example": {
    "@id": "https://thing2.example/",
    "@context": {"termFromWellKnownProtectedContext": {"@id": "https://bar.example/"}},
    "termFromWellKnownProtectedContext": "this only conflicts when embedded"
  }
}
I don't see any solution for this except to add @propagate: false to the context that uses protected terms:
{
  "@id": "https://thing1.example/",
  "@context": [
    {
      "termFromWellKnownProtectedContext": {
        "@id": "https://foo.example/",
        "@protected": true
      },
      "@propagate": false
    }
  ],
  "termFromWellKnownProtectedContext": true,
  "https://generic-property.example": {
    "@id": "https://thing2.example/",
    "@context": {"termFromWellKnownProtectedContext": {"@id": "https://bar.example/"}},
    "termFromWellKnownProtectedContext": "this only conflicts when embedded"
  }
}
To be clear, there is a boundary between https://thing1.example/ and https://thing2.example/, and the boundary is being crossed when the HTTPS URI is dereferenced as the property https://generic-property.example is traversed. Propagating a protected context across that boundary seems to be the cause of the issue.
I recognize that there may be an X-Y error in approach elsewhere -- perhaps the mixing of documents like this is inherently problematic? But I'm not sure what the correct alternative approach would be. It seems particularly common among plain JSON developers to opt for nesting like this, but nesting like this turns an otherwise perfectly fine and processable JSON-LD document into an unprocessable redefinition error.
In light of this, I would think that it's appropriate to ask publishers of protected contexts to consider where their data formats might propagate via nesting and where they shouldn't propagate. In effect, whatever guarantees a protected context provides are not universal across the entire linked data graph; they hold only as long as you don't cross any boundaries. The protected context should apply only along certain property paths; if the generic consumer traverses an externally defined or extension property path, then protected terms should no longer apply.
If I'm not wrong in my reasoning somewhere, I can proceed to file a PR for the insertions proposed at the end of my previous comment. That PR should close this issue.
As discussed in today's meeting, we're closing this issue without prejudice.
This was discussed during the json-ld meeting on 07 May 2025.
View the transcript
w3c/json-ld-syntax#443
<gb> Issue 443 `@protected` creates unresolvable conflicts when conforming to multiple normative contexts (by trwnh) [spec:editorial] [propose closing] [wr:commenter-agreed-partial] [class-2]
gkellogg: I suggested we close this issue in February
… there's a resolution from last November on it
We'll close the issue.
gkellogg: so, if there are no objections, I think we can close this one and move on
<niklasl> +1
<bigbluehat> +1