
Handling JSON-LD with duplicate names

Open phochste opened this issue 9 months ago • 6 comments

I was recently reading this blog post: https://alexwlchan.net/2025/duplicate-names-in-json/ which states that the JSON syntax does not require that name strings should be unique. However, these semantics may be considered in further specifications about the use of JSON in data exchanges.

In the JSON-LD specs, as far as I know, little is said about the handling of duplicate names:

  • In https://www.w3.org/TR/json-ld/ 4.1 I see a mention of duplicate context terms

Can the editor clarify how duplicated names should be handled?

{
  "@context": "http://schema.org/",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "jobTitle": "Dean",
  "url": "http://www.janedoe.com"
}

phochste avatar May 06 '25 04:05 phochste

I'm pretty sure issues like that are considered to be out of scope. JSON-LD isn't defining how JSON is parsed. I suspect almost every JSON parser will pick one value or, in rare cases, throw an error by default. For systems where it's a concern, I'd suggest using a parser that fails on such constructs.
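
As an illustration of "a parser that fails on such constructs", here is a minimal sketch using Python's standard `json` module and its `object_pairs_hook` parameter. The helper names `reject_duplicate_names` and `strict_loads` are made up for this example and are not part of any JSON-LD library.

```python
import json


def reject_duplicate_names(pairs):
    """object_pairs_hook that raises if a JSON object repeats a name."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError(f"duplicate name in JSON object: {name!r}")
        result[name] = value
    return result


def strict_loads(text):
    # Hypothetical helper: parse JSON, failing on duplicate names at any depth.
    return json.loads(text, object_pairs_hook=reject_duplicate_names)


doc = '{"@type": "Person", "jobTitle": "Professor", "jobTitle": "Dean"}'

try:
    strict_loads(doc)
except ValueError as err:
    print(err)
```

The hook is called for every object in the document, so nested objects are checked as well. The parsed result can then be handed to whatever JSON-LD processor is in use.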

Was there an expectation that JSON-LD in particular would behave differently from all other JSON-based systems?

davidlehn avatar May 06 '25 18:05 davidlehn

I don't think it is a pure parsing problem but an interpretation problem. As I understand it, the JSON spec does not require JSON processors to handle non-unique names in a particular way. Of course, there exists a de facto way, based on what the majority of processors do.

Without some clarification this JSON-LD:

{
  "@context": "http://schema.org/",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "jobTitle": "Dean",
  "url": "http://www.janedoe.com"
}

could be potentially parsed by different JSON processors and lead to different data models:

_:x a schema:Person ;
    schema:name "Jane Doe" ;
    schema:jobTitle "Professor" , "Dean" ;
    schema:url <http://www.janedoe.com/> .

or

_:x a schema:Person ;
    schema:name "Jane Doe" ;
    schema:jobTitle "Professor" ;
    schema:url <http://www.janedoe.com/> .

or

_:x a schema:Person ;
    schema:name "Jane Doe" ;
    schema:jobTitle "Dean" ;
    schema:url <http://www.janedoe.com/> .

or

(error, no data model)

All, depending on the parsing method. Is that intentionally the case for the JSON-LD spec?
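
To make the dependence on the parser concrete, here is a small sketch with Python's standard `json` module. By default it keeps the last value for a repeated name; the two hooks `keep_first` and `keep_all` are hypothetical policies written for this example to mimic what other parsers might do.

```python
import json

DOC = """
{
  "@context": "http://schema.org/",
  "@type": "Person",
  "name": "Jane Doe",
  "jobTitle": "Professor",
  "jobTitle": "Dean",
  "url": "http://www.janedoe.com"
}
"""


def keep_first(pairs):
    """Hypothetical policy: the first occurrence of a name wins."""
    out = {}
    for name, value in pairs:
        out.setdefault(name, value)
    return out


def keep_all(pairs):
    """Hypothetical policy: collect the values of a repeated name in a list."""
    out, repeated = {}, set()
    for name, value in pairs:
        if name not in out:
            out[name] = value
        elif name in repeated:
            out[name].append(value)
        else:
            out[name] = [out[name], value]
            repeated.add(name)
    return out


last = json.loads(DOC)                               # default: last value wins
first = json.loads(DOC, object_pairs_hook=keep_first)
combined = json.loads(DOC, object_pairs_hook=keep_all)

print(last["jobTitle"])      # Dean
print(first["jobTitle"])     # Professor
print(combined["jobTitle"])  # ['Professor', 'Dean']
```

Each of the three dictionaries is a perfectly ordinary input for a JSON-LD processor, which is why the same bytes can end up as three different graphs.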

My expectation was not that all four interpretations are valid ways to process JSON-LD.

I write this in the context of ODRL processing. There I would like only one way to interpret the JSON-LD, regardless of the underlying JSON processor.

phochste avatar May 07 '25 05:05 phochste

I read in https://www.w3.org/TR/json-ld11-api/#terminology:

Terms imported from ECMAScript Language Specification [ECMASCRIPT], The JavaScript Object Notation (JSON) Data Interchange Format [RFC8259], Infra Standard [INFRA], and Web IDL [WEBIDL]

And a bit further in the definition of JSON object:

"In JSON-LD the names in an object must be unique."

However, the JSON-LD definition references RFC8259, where object names SHOULD be unique (non-unique is technically valid but discouraged).

Is this "must" in the spec normative? And MUST my previous example therefore be a parsing error?

phochste avatar May 07 '25 15:05 phochste

"In JSON-LD the names in an object must be unique."

That's problematic. RDF does not place such restrictions — any given subject could have multiple objects for the same predicate, which are not considered to conflict but rather to combine.

How are these to be handled when serializing RDF as JSON-LD?

TallTed avatar May 07 '25 17:05 TallTed

@TallTed,

"In JSON-LD the names in an object must be unique."

That's problematic. RDF does not place such restrictions — any given subject could have multiple objects for the same predicate, which are not considered to conflict but rather to combine.

How are these to be handled when serializing RDF as JSON-LD?

In JSON-LD, multiple objects for the same predicate are expressed as elements in an array, and the array is the value of a single JSON name (aka "JSON key").
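
As a rough sketch of that idea (not the actual JSON-LD serialization algorithm), the following Python groups repeated predicates into one array per JSON name. The function `group_by_predicate` and the sample statements are made up for this example.

```python
import json
from collections import defaultdict


def group_by_predicate(pairs):
    """Group (predicate, value) pairs so that each predicate appears once."""
    grouped = defaultdict(list)
    for predicate, value in pairs:
        grouped[predicate].append(value)
    # A single value stays a scalar; several values become an array.
    return {p: v[0] if len(v) == 1 else v for p, v in grouped.items()}


# Sample statements about one subject, mirroring the example in this issue.
statements = [
    ("name", "Jane Doe"),
    ("jobTitle", "Professor"),
    ("jobTitle", "Dean"),
]

node = {
    "@context": "http://schema.org/",
    "@type": "Person",
    **group_by_predicate(statements),
}

print(json.dumps(node, indent=2))
```

The resulting object has a single `jobTitle` name whose value is the array `["Professor", "Dean"]`, so no duplicate names are needed to express both values.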

I presume that JSON that contains names that are not unique really only continues to be supported for historical or backwards compatibility purposes. It is not interoperable JSON, see:

https://datatracker.ietf.org/doc/html/rfc8259#section-4

In other words, if you use that kind of JSON, you get no guarantee of interoperability -- and you may do as you please in your own closed world application.

While I think the current JSON-LD algorithms all operate on an abstract expression of JSON (i.e., the syntax/concrete parsing itself is external), I have wondered in the past if it would be better for JSON-LD to reference "I-JSON" as its basis so that these sorts of corner case questions can be resolved by just pointing to that spec. That spec handles non-interoperable JSON issues such as the one in this issue (and a number of others) by defining a more strict profile:

https://datatracker.ietf.org/doc/html/rfc7493#section-2.3

In practice, I expect very few implementers to use anything other than the built-in JSON parsers in whatever platform they are using -- which is therefore the only sensible basis for wide-scale interoperability.

dlongley avatar May 07 '25 20:05 dlongley

While I think the current JSON-LD algorithms all operate on an abstract expression of JSON (i.e., the syntax/concrete parsing itself is external), I have wondered in the past if it would be better for JSON-LD to reference "I-JSON" as its basis so that these sorts of corner case questions can be resolved by just pointing to that spec.

I think an informative reference to I-JSON would be useful and, as you say, most implementations use a native JSON parser that generates an INFRA ordered map for JSON objects. INFRA disallows duplicate keys. By the time the algorithms see the data, there will only be a single entry, so the de-duplication is outside the realm of the JSON-LD spec itself.

As an erratum, we can add such informative references and reinforce the fact that the result of parsing a JSON-LD object with duplicate keys is undefined.

gkellogg avatar Jun 02 '25 17:06 gkellogg