Language update (add and standardize languages)

Open joshreisner opened this issue 2 years ago • 18 comments

we define language types in variables.php on a per-program basis

we want to:

  1. extract all the languages into a single array and apply them to all programs
  2. add the additional meeting types being added (currently listed under Proposed New Types) on the JSON spec
  3. confirm that these changes only add and do not change any existing types for any programs
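
A minimal sketch of goals 1 and 3, not the actual implementation. The `$tsml_programs[...]['types']` shape matches the snippet later in this thread; `$tsml_languages`, the `demo1`/`demo2` program keys, and the sample types are hypothetical and only there for illustration.

```php
<?php
// Hypothetical shared list; only a few of the codes from this thread are shown.
$tsml_languages = [
    'DA' => 'Danish',
    'DE' => 'German',
    'SV' => 'Swedish',
];

// Illustrative stand-in for the per-program definitions in variables.php.
$tsml_programs = [
    'demo1' => ['types' => ['O' => 'Open', 'DE' => 'Deutsch']],
    'demo2' => ['types' => ['C' => 'Closed']],
];

foreach ($tsml_programs as $key => $program) {
    // Array union (+) keeps the left-hand value when a key exists in both,
    // so a program's existing types are never overwritten, only added to.
    $tsml_programs[$key]['types'] = $program['types'] + $tsml_languages;
}
```

With the union operator, `demo1` keeps its own `DE` label while `demo2` picks up the shared one. One caveat of this approach: if a program already uses a language's code for an unrelated type, the language is silently skipped for that program, so collisions would still need a manual check.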

joshreisner avatar Nov 04 '23 18:11 joshreisner

here are the new language types that were recently added:

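| Code | Language   |
|------|------------|
| AM   | Amharic    |
| DA   | Danish     |
| DE   | German     |
| EL   | Greek      |
| FA   | Persian    |
| HI   | Hindi      |
| HR   | Croatian   |
| HU   | Hungarian  |
| LT   | Lithuanian |
| ML   | Malayalam  |
| SK   | Slovak     |
| SV   | Swedish    |
| TH   | Thai       |
| TL   | Tagalog    |
| UK   | Ukrainian  |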

joshreisner avatar Nov 09 '23 15:11 joshreisner

FYI Ukrainian is 'UA'

gkovats avatar Nov 19 '23 14:11 gkovats

i think UA may be the locale code for Ukraine (the country), but UK is for Ukrainian (the language)?

places i see UK:

  • https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
  • https://www.w3schools.com/tags/ref_language_codes.asp
  • https://www.loc.gov/standards/iso639-2/php/code_list.php
  • http://www.lingoes.net/en/translator/langcode.htm
  • https://www.science.co.il/language/Codes.php
  • https://developers.google.com/admin-sdk/directory/v1/languages

places i see UA: https://www.iso.org/obp/ui/#iso:code:3166:UA

joshreisner avatar Nov 19 '23 16:11 joshreisner

Sorry, meant according to variables.php https://github.com/code4recovery/12-step-meeting-list/blob/main/includes/variables.php#L1353

New to the codebase, just saw it'd be using a different code there. May not be relevant to this request.

gkovats avatar Nov 19 '23 19:11 gkovats

oh good callout, thanks! i guess we should make an exception for Survivors of Incest Anonymous

joshreisner avatar Nov 19 '23 20:11 joshreisner

Thoughts on standardizing languages to ISO 639 Set 1 codes? I think this would also call for changes in spec, eg "Spanish" S -> es

gobborg avatar Jun 24 '24 16:06 gobborg

the complexity of implementing in AA across all its various consumers is not worth the effort, IMO

joshreisner avatar Jun 24 '24 16:06 joshreisner

"3. confirm that these changes only add and do not change any existing types for any programs" -> I do not know that this is possible without granting several exceptions, which can get messy unless existing programs are allowed a legacy rule.

As it is, there are 3 ways in variables.php that Spanish is defined, ES, S, and SP, but I think Spanish is the only language with multiple definitions. Different programs also abbreviate Speaker differently, including S, SP, and SPK. If not imposing ISO 639 set 1, I see as options:

  1. Do not touch Spanish at all and continue to define Spanish on a per-program basis.
  2. Force Spanish 'S' and Speaker 'SP', as per spec, and apply to all programs. Spec's existing language codes are allowed to continue, and newly added languages are ISO 639-compliant.

Would changing SIA's Ukrainian UA -> UK impact them (unless maybe they try to load from a backup csv)? They're the only ones with Ukrainian, and UK is otherwise not a type in variables.php.

Workshopping, I imagine an option where in "Meeting Information"

Day Time Types Notes ...

A field gets added for "Language", between Types and Notes, and selecting a Language is optional.

Impact:

  • decouple Language from Types
  • existing programs with a selected language Type would lose that type and would need to reselect in the new field upon TSML version update
  • less "visual clutter" for programs that offer many Types
  • Language can be collapsed with a "View all" to make it easily ignored by programs that don't support languages other than English

I welcome any thoughts, feedback, etc.

Edit: diction

gobborg avatar Jun 24 '24 16:06 gobborg

i'm down to start a new set of checkboxes between types and notes, i do think that could help declutter the UI. but let's not make changes to where or how the data is stored or transmitted.

so hopefully sites wouldn't lose any data.

great callouts on UK and S / ES / SP -- i think the safest option would be to preserve those inconsistencies while defaulting the rest of the languages, eg:

// preserve legacy inconsistencies
unset($tsml_programs['SIA']['types']['UK']);
$tsml_programs['SIA']['types']['UA'] = 'Ukrainian';

joshreisner avatar Jun 24 '24 19:06 joshreisner

Can you be more specific about:

  1. Where to store the data? Are Languages to be part of the Types (sql table?) but displayed as their own block?
  2. What to do re Spanish/Speaker?

Could you also tell me what the impact would be on changing abbreviations, eg Ukrainian, if not allowing a legacy rule?

Edit: diction

gobborg avatar Jun 24 '24 20:06 gobborg

language information is currently stored in types postmeta, we should not make changes to that

if we were to change the codes associated with the types, people could lose data, so let's try to avoid that

therefore let's keep the Spanish/Speaker types as they are with their various inconsistencies

joshreisner avatar Jun 24 '24 22:06 joshreisner

Ok, hardcoding legacy exceptions for the various groups is fine, but what to do about programs that don't have Spanish as a type? Should Spanish be coded S per spec or ES per ISO 639?

Edit: diction

gobborg avatar Jun 25 '24 14:06 gobborg

i'd say let's make the default whichever is currently the most popular, because it will mean the fewest exceptions

if i could go back in time and do things differently i'd have made this follow the international spec, but these got inherited from a particular aa intergroup and those are the codes they used
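
One way to find the "most popular" code is to tally what each program currently uses. The following is a rough sketch under stated assumptions: the sample data is made up, and `count_codes_for()` is a hypothetical helper, not something that exists in the plugin.

```php
<?php
// Illustrative data only; the real per-program types live in variables.php.
$tsml_programs = [
    'demo1' => ['types' => ['S' => 'Spanish', 'SP' => 'Speaker']],
    'demo2' => ['types' => ['ES' => 'Spanish', 'S' => 'Speaker']],
    'demo3' => ['types' => ['S' => 'Spanish']],
];

// Hypothetical helper: tally the codes that programs use for one type name.
function count_codes_for(array $programs, string $name): array
{
    $counts = [];
    foreach ($programs as $program) {
        foreach ($program['types'] as $code => $label) {
            if ($label === $name) {
                $counts[$code] = ($counts[$code] ?? 0) + 1;
            }
        }
    }
    arsort($counts); // most common code first, keys preserved
    return $counts;
}

$spanish = count_codes_for($tsml_programs, 'Spanish');
$default = array_key_first($spanish); // candidate default code
```

Every program whose code differs from `$default` would then need a legacy exception, so the tally also gives a quick count of how many exceptions a given default implies.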

joshreisner avatar Jun 25 '24 16:06 joshreisner

SGTM. Ready to see this on a branch?

gobborg avatar Jun 25 '24 17:06 gobborg

looks good!

  • for consistency with the meeting guide app, ASL should stay with Types
  • let's alphabetize em

joshreisner avatar Jun 25 '24 18:06 joshreisner

Languages are alphabetized by code.

gobborg avatar Jun 25 '24 18:06 gobborg

let's sort em post-translation by value so the user doesn't have to hunt down the list for "Dutch" for example
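
Sorting by translated label rather than by code could look roughly like this. It is only a sketch: the `$translations` map stands in for a real translation call (for example WordPress's `__()`), and the Spanish labels are there purely to show the order changing.

```php
<?php
// Codes and English names taken from the list earlier in this thread.
$languages = [
    'DA' => 'Danish',
    'DE' => 'German',
    'EL' => 'Greek',
    'SV' => 'Swedish',
];

// Hypothetical stand-in for a translation lookup; pretends the site is in Spanish.
$translations = [
    'Danish'  => 'Danés',
    'German'  => 'Alemán',
    'Greek'   => 'Griego',
    'Swedish' => 'Sueco',
];

// array_map() with a single array preserves the code keys.
$translated = array_map(fn($name) => $translations[$name] ?? $name, $languages);

// asort() orders by value and keeps the code => label association.
asort($translated);
```

Here German moves to the top because its translated label starts with "A". Note that `asort()` compares bytes, so labels that begin with accented characters may need a locale-aware comparison instead.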

joshreisner avatar Jun 25 '24 18:06 joshreisner

https://github.com/code4recovery/12-step-meeting-list/commit/c3b100f45f164cd3c9f6d5a9fcd9c6cde6108ba3

gobborg avatar Jun 25 '24 19:06 gobborg