activitystreams
Spec does not clarify non-functional natural language values when mapped
Please Indicate One:
- [ ] Editorial
- [ ] Question
- [X] Feedback
- [X] Blocking Issue
- [ ] Non-Blocking Issue
Please Describe the Issue:
In https://www.w3.org/TR/activitystreams-core/#naturalLanguageValues the language-mapped forms are exemplified:
Accordingly, in the JSON serialization, the terms "name", "summary", and "content" represent the JSON string forms; and the terms "nameMap", "summaryMap", and "contentMap" represent the object forms.
An example provided is Example 22:
{
"@context": "https://www.w3.org/ns/activitystreams",
"type": "Object",
"nameMap": {
"en": "This is the title",
"fr": "C'est le titre",
"es": "Este es el título"
}
}
However, according to https://www.w3.org/TR/activitystreams-vocabulary/#properties none of the above properties are marked 'functional': `name`, `summary`, and `content`. Thus, having multiple values for these properties is valid.
Therefore, the following message is within spec:
{
"@context": "https://www.w3.org/ns/activitystreams",
"type": "Object",
"name": [ "This is the title", "This is another title" ]
}
However, the spec does not describe how this should be handled in map form, if at all. Two options that would handle it include:
{
"@context": "https://www.w3.org/ns/activitystreams",
"type": "Object",
"nameMap": {
"en": [ "This is the title", "This is another title" ],
"fr": [ "C'est le titre", "C'est un autre titre" ],
"es": [ "Este es el título", "Este es otro título" ]
}
}
and
{
"@context": "https://www.w3.org/ns/activitystreams",
"type": "Object",
"nameMap": [
{
"en": "This is the title",
"fr": "C'est le titre",
"es": "Este es el título"
},
{
"en": "This is another title",
"fr": "C'est un autre titre",
"es": "Este es otro título"
}
]
}
Another implementation could ignore these altogether as being "unhandled", and all three could claim to follow the spec due to the lack of guidance.
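To make the ambiguity concrete, here is a minimal sketch of how a consumer might accept both candidate shapes above and reduce them to one internal form. The function name `normalize_name_map` and the target shape (language tag mapped to a list of strings) are assumptions made for illustration; neither shape is defined by the spec.

```python
from typing import Dict, List, Union

# Hypothetical type: one language map whose values are a string or a list of strings.
LangMap = Dict[str, Union[str, List[str]]]


def normalize_name_map(value: Union[LangMap, List[LangMap]]) -> Dict[str, List[str]]:
    """Collapse either candidate nameMap shape into {language: [values, ...]}."""
    # Shape two is a list of maps; shape one (and the spec's form) is a single map.
    maps = value if isinstance(value, list) else [value]
    result: Dict[str, List[str]] = {}
    for lang_map in maps:
        for lang, text in lang_map.items():
            # A plain string is treated as a one-element list.
            texts = text if isinstance(text, list) else [text]
            result.setdefault(lang, []).extend(texts)
    return result
```

With this normalization both proposals carry the same information, which suggests the open question is purely about which serialization, if any, the spec should bless.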
[ edit, became clear by 437 comment ] I see. Referring to this document, https://www.w3.org/TR/activitystreams-vocabulary/#dfn-content:
- `content` and `contentMap` (and the others) should maybe form two different table rows (I'd prefer one row per property)
- in the document https://www.w3.org/TR/activitystreams-vocabulary/ all occurrences of "multiple language-tagged values" could be linked to Section 4.7 of the ActivityStreams Core
When I was initially reading the vocabulary document I was also unaware that e.g. `content` and `contentMap` are mutually exclusive and that there is a special "und" language tag for cases where the language is unavailable.
@cjslep please also note that for `content` / `contentMap` it gets worse:
How should `mediaType` behave with multiple `content` values?
`mediaType` is marked functional!
https://www.w3.org/TR/activitystreams-vocabulary/#dfn-mediatype
This does not even allow me to mix e.g. HTML and Markdown content …
@sebilasse
> How should mediaType behave with multiple content ?

Is this what you mean?
{
"type": "Note",
"mediaType": "text/plain",
"content": [
"<!doctype html>some html",
"{}"
]
}
How do we interpret the multiple values of `content`?
I do believe this is a good bug. I think the quickest thing we could do is at least to add to that description of 'mediaType'.
If `content` or `contentMap` have multiple values, then the meaning of a single `mediaType` value is undefined.
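As a rough illustration of that proposed wording, a consumer could flag objects where a single `mediaType` would have to describe several `content` values. This is only a sketch: the helper name `media_type_is_ambiguous` is hypothetical, and it checks only the array form of `content`.

```python
def media_type_is_ambiguous(obj: dict) -> bool:
    """Return True when one mediaType would have to cover several content values."""
    if "mediaType" not in obj:
        # Nothing to misapply if no mediaType is given.
        return False
    content = obj.get("content")
    # Assumption: only a JSON array with more than one entry counts as "multiple values".
    return isinstance(content, list) and len(content) > 1
```

Applied to the Note above, with `text/plain` and two `content` entries, the check reports the object as ambiguous.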
Separately...
- I think the best solution to this is to add a new Type to the range of the `content` property. Allow values like this:
{
"type": "StringContent",
"mediaType": "application/json",
"string": "{}"
}
or, honestly, just allow Links in the range of Content, so allow
{
"type": "Link",
"href": "data:application/json;charset=utf-8;base64,e30="
}
And then deprecate 'mediaType' on Objects.
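For the Link variant, the media type travels inside the `data:` URI itself, so a consumer can recover both the type and the payload per value. A minimal sketch, assuming a simple RFC 2397 style URI; the function name `parse_data_uri` is made up for this example and media type parameters such as `charset` are ignored.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(href: str):
    """Split a data: URI into (media_type, payload_bytes)."""
    if not href.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, data = href[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data: URI has no comma separator")
    params = header.split(";")
    # A trailing ";base64" marks the payload as base64 rather than percent-encoded.
    is_base64 = params[-1] == "base64"
    if is_base64:
        params = params[:-1]
    # RFC 2397 defaults to text/plain when the media type is omitted.
    media_type = params[0] or "text/plain"
    payload = base64.b64decode(data) if is_base64 else unquote_to_bytes(data)
    return media_type, payload
```

For the `href` in the Link example above this yields `application/json` and the two bytes `{}`.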
Could use editorial feedback @cwebber
@gobengo This is exactly what I meant.
I would go for the "best solution" 😁 If multiple content items are provided, each one should have its own content encoding and media type.
See e.g. how the JSON Schema spec deals with it: http://json-schema.org/latest/json-schema-validation.html#rfc.section.8.3
The default could be
{
"content": "foo",
"encoding": "8bit",
"mediaType": "text/html"
}
but it could also be an image
{
"content": "bar",
"encoding": "base64",
"mediaType": "image/png"
}
where the encoding (`contentEncoding` in JSON Schema terms) can be any RFC 2045 transfer encoding:
"7bit" | "8bit" | "binary" | "quoted-printable" | "base64" | ietf-token | x-token
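A consumer of such per-item objects might decode each one according to its own `encoding`. The following is a sketch only: the `encoding` key follows the examples in this comment and is not an ActivityStreams property, `decode_content` is a hypothetical helper, and the extension tokens (`ietf-token`, `x-token`) are simply rejected. The payloads in the comment are placeholders ("bar" is not valid base64), so real data would be needed to exercise the base64 branch.

```python
import base64
import quopri


def decode_content(item: dict) -> bytes:
    """Decode one content item according to its own transfer encoding."""
    # Assumption: "8bit" is the default, as suggested in the comment above.
    encoding = item.get("encoding", "8bit").lower()
    raw = item["content"].encode("utf-8")
    if encoding == "base64":
        return base64.b64decode(raw)
    if encoding == "quoted-printable":
        return quopri.decodestring(raw)
    if encoding in ("7bit", "8bit", "binary"):
        # Identity encodings: the bytes are used as they are.
        return raw
    raise ValueError(f"unsupported encoding: {encoding}")
```

Because every item carries its own `encoding` and `mediaType`, HTML and binary content can sit side by side without a shared, functional `mediaType`.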
@cjslep fyi: Made JSON Schemas https://github.com/redaktor/ActivityPubSchema
I think we want the same thing. I think there's work to be done to clarify what the 'range' of 'content' should be. `Object` is probably fine, but might be weirdly broad. Perhaps an extension should define a `Content` type and related `StringContent`. Or take a look at `oa:TextualBody`.
The Vocabulary document does not specify that these properties are "functional", but it does refer to the properties in the singular as part of the definitions. For example,
- `content`: "The content or textual representation of the Object"
- `name`: "A simple, human-readable, plain-text name for the object."
- `summary`: "A natural language summarization of the object encoded as HTML."
None of the examples have multiple values for these properties, and there is no guidance on how consumers should handle multiple values here.
I think the resolution for this problem is to include the Functional flag for these properties in the ERRATA, and to document a best practice for dealing with multiple values if found.
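The thread does not settle on what that best practice should be. Purely as an illustration, one defensive policy a consumer could adopt is "first value wins" for properties it treats as functional. The helper name `single_value` and the policy itself are assumptions, not something the spec or the errata prescribes.

```python
def single_value(obj: dict, prop: str):
    """Return one value for a property the consumer treats as functional."""
    value = obj.get(prop)
    if isinstance(value, list):
        # Assumed policy: keep the first entry and ignore the rest.
        return value[0] if value else None
    return value
```

Other policies (rejecting the object, or joining the values) would be equally defensible until guidance is published.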