Update language attribute restrictions in API

Open yheuhtozr opened this issue 2 years ago • 4 comments

As far as I can tell from the current implementation, the software already processes hyphenated language tags and legitimately emits ISO 639-3 codes as well as ISO 639-1 codes. That means that even though Mastodon itself is not ready to support most advanced language tags, it has no problem accepting and emitting those codes in the API. Thus we can:

  • remove some outdated descriptions that limit the value to an "ISO 639-1 two-letter code", which no longer applies
  • accept well-formed BCP 47 language tags, with a note that Mastodon will probably ignore the additional information
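
To make "well-formed" a little more concrete, here is a minimal sketch of how a client or server might check a tag and keep only the primary language subtag. It assumes a simplified subset of the BCP 47 grammar (language, optional script, optional region, optional variants); extlang, extension, private-use, and grandfathered tags are left out on purpose, so some valid tags are rejected. The name `primary_language` is hypothetical and is not taken from the Mastodon code.

```python
import re
from typing import Optional

# Simplified subset of the BCP 47 grammar, assumed for illustration only:
# language[-script][-region][-variant...]
_LANGTAG = re.compile(
    r"""
    (?P<language>[A-Za-z]{2,3})                 # two- or three-letter primary subtag
    (?:-(?P<script>[A-Za-z]{4}))?               # optional script, e.g. Hant
    (?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?      # optional region, e.g. TW or 419
    (?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)  # optional variants
    """,
    re.VERBOSE,
)


def primary_language(tag: str) -> Optional[str]:
    """Return the lowercased primary language subtag, or None if the tag
    does not match the simplified grammar above."""
    match = _LANGTAG.fullmatch(tag)
    return match.group("language").lower() if match else None
```

Under these assumptions, `primary_language("zh-Hant-TW")` yields `"zh"` and `primary_language("yue")` yields `"yue"`, while an underscore form such as `"en_US"` is rejected. This mirrors the "ignore additional information" note: the extra subtags are tolerated but not interpreted.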

https://github.com/mastodon/mastodon/issues/19302 corroborates that such a language tag doesn't break the system.

The rationale for this change is discussed in https://github.com/mastodon/mastodon/issues/23541.

Closes https://github.com/mastodon/mastodon/issues/23541.

  • Note: "language subtag" in BCP 47 ≈ ISO 639-1 ∪ ISO 639-3

yheuhtozr avatar Mar 29 '23 09:03 yheuhtozr

@yheuhtozr is attempting to deploy a commit to the Mastodon Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar Mar 29 '23 09:03 vercel[bot]

+1 to the general idea that BCP47 language tags are the way to go.

-1 to this specific change, as I think it's a backwards incompatible change (Mastodon servers could now emit a string field that is > 2 characters long, where previously they were documented as only emitting a 2 character string).

A backwards compatible change would be to accept / emit a new language_code field, where the value is an object with type and code fields. I.e.,

"language_code": {
    "type": "...",
    "code": "..." 
},

Valid initial values for type would be iso639-1, iso639-2 (3 letter codes), and bcp47.

So:

"language_code": {
    "type": "iso639-1",
    "code": "en"
},

is equivalent to the current:

"language": "en",

If this field exists then the contents of the language field would be ignored.
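
As a rough sketch of that precedence rule, a client could resolve the language as below. This assumes the proposed `language_code` object shape from this comment, which is only an example and not an existing Mastodon API field; the function name `effective_language` is hypothetical.

```python
from typing import Any, Mapping, Optional


def effective_language(status: Mapping[str, Any]) -> Optional[str]:
    """Prefer the proposed language_code object; fall back to the
    existing language string when the new field is absent."""
    language_code = status.get("language_code")
    if isinstance(language_code, Mapping) and "code" in language_code:
        # New-style field present: the legacy "language" field is ignored.
        return language_code["code"]
    return status.get("language")
```

An older client that only reads `language` keeps working unchanged, which is the point of adding a new field instead of widening the existing one.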


Edit to note: The above is an example, not something I've spent a serious amount of design thought on.

nikclayton avatar May 31 '23 13:05 nikclayton

> A backwards compatible change would be to accept / emit a new language_code field, where the value is an object with type and code fields.

@nikclayton Hi, that would be totally fine with me, too. Just to be sure (since I'm an outsider to the project): I think it entails additional logic in the Mastodon code, but do you think that would be more feasible?

yheuhtozr avatar Jul 04 '23 07:07 yheuhtozr

This pull request has merge conflicts that must be resolved before it can be merged.

github-actions[bot] avatar Aug 21 '24 23:08 github-actions[bot]