Adding three-letter ISO code for languages

Open CasperWSchmidt opened this issue 3 years ago • 10 comments

Hi there. We currently validate language codes in our system using the regex ^[a-z]{3}$, but I would like to tighten the validation to actual ISO codes. From what I can see in this repo, only the two-letter ISO codes are part of the translations. Is it feasible to add the three-letter ISO codes as well?
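A minimal sketch of the tighter validation described above, assuming a hard-coded set of accepted codes. The class name `LanguageCodeValidator` and the five sample codes are illustrative assumptions and not part of Nager.Country; a real set would have to cover the whole standard that is chosen.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LanguageCodeValidator
{
    // Hypothetical sample; a real set would hold every code of the chosen standard.
    private static readonly HashSet<string> KnownCodes =
        new HashSet<string>(StringComparer.Ordinal) { "dan", "deu", "eng", "fra", "spa" };

    // The format-only check currently in use: any three lowercase letters pass.
    private static readonly Regex Shape = new Regex("^[a-z]{3}$");

    public static bool HasValidShape(string code) =>
        code != null && Shape.IsMatch(code);

    // The tightened check: the code must also be present in the known set.
    public static bool IsKnownCode(string code) =>
        HasValidShape(code) && KnownCodes.Contains(code);
}
```

With this sample set, a string such as "xxx" passes `HasValidShape` but is rejected by `IsKnownCode`, which is the difference the request is after.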

Also, language info is not part of the main package Nager.Country but of Nager.Country.Translation. Isn't it relevant to have the spoken language(s) of a country in the main package, just like currencies? The translations could then stay in a separate package to keep the size down (as noted in #2).

CasperWSchmidt avatar Oct 04 '22 08:10 CasperWSchmidt

I think we only need a dictionary with the mapping https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes

Do you think ISO 639-3 is the best language code standard for this?

tinohager avatar Oct 04 '22 08:10 tinohager

Well, according to https://iso639-3.sil.org/about/relationships, ISO 639-3 was devised to provide a comprehensive set of identifiers for all languages for use in a wide range of applications, including linguistics, lexicography and internationalization of information systems. The page also describes the differences between 639-1, 639-2 and 639-3. So basically, the two-letter ISO 639-1 standard is a subset of the three-letter ISO 639-3 standard.

I believe the best standard depends on what it should be used for. Is it simply a set of "overall"/"main" languages spoken in the country, or is it necessary to have more fine-grained options (Arabic is an example)?

CasperWSchmidt avatar Oct 13 '22 10:10 CasperWSchmidt

I think that would be useful too! It would be nice to also have three-letter codes for languages, just like there are for countries. It would make this library even more complete and robust!

IngBertolini avatar Oct 17 '22 13:10 IngBertolini

Can someone validate the data? Datasource: https://datahub.io/core/language-codes

using System;
using System.Collections.Generic;

public class Program {
    public static void Main() {
        var items = new Dictionary<string, string>();
        items.Add("aa", "aar");
        items.Add("ab", "abk");
        items.Add("af", "afr");
        items.Add("ak", "aka");
        items.Add("sq", "alb");
        items.Add("am", "amh");
        items.Add("ar", "ara");
        items.Add("an", "arg");
        items.Add("hy", "arm");
        items.Add("as", "asm");
        items.Add("av", "ava");
        items.Add("ae", "ave");
        items.Add("ay", "aym");
        items.Add("az", "aze");
        items.Add("ba", "bak");
        items.Add("bm", "bam");
        items.Add("eu", "baq");
        items.Add("be", "bel");
        items.Add("bn", "ben");
        items.Add("bh", "bih");
        items.Add("bi", "bis");
        items.Add("bs", "bos");
        items.Add("br", "bre");
        items.Add("bg", "bul");
        items.Add("my", "bur");
        items.Add("ca", "cat");
        items.Add("ch", "cha");
        items.Add("ce", "che");
        items.Add("zh", "chi");
        items.Add("cu", "chu");
        items.Add("cv", "chv");
        items.Add("kw", "cor");
        items.Add("co", "cos");
        items.Add("cr", "cre");
        items.Add("cs", "cze");
        items.Add("da", "dan");
        items.Add("dv", "div");
        items.Add("nl", "dut");
        items.Add("dz", "dzo");
        items.Add("en", "eng");
        items.Add("eo", "epo");
        items.Add("et", "est");
        items.Add("ee", "ewe");
        items.Add("fo", "fao");
        items.Add("fj", "fij");
        items.Add("fi", "fin");
        items.Add("fr", "fre");
        items.Add("fy", "fry");
        items.Add("ff", "ful");
        items.Add("ka", "geo");
        items.Add("de", "ger");
        items.Add("gd", "gla");
        items.Add("ga", "gle");
        items.Add("gl", "glg");
        items.Add("gv", "glv");
        items.Add("el", "gre");
        items.Add("gn", "grn");
        items.Add("gu", "guj");
        items.Add("ht", "hat");
        items.Add("ha", "hau");
        items.Add("he", "heb");
        items.Add("hz", "her");
        items.Add("hi", "hin");
        items.Add("ho", "hmo");
        items.Add("hr", "hrv");
        items.Add("hu", "hun");
        items.Add("ig", "ibo");
        items.Add("is", "ice");
        items.Add("io", "ido");
        items.Add("ii", "iii");
        items.Add("iu", "iku");
        items.Add("ie", "ile");
        items.Add("ia", "ina");
        items.Add("id", "ind");
        items.Add("ik", "ipk");
        items.Add("it", "ita");
        items.Add("jv", "jav");
        items.Add("ja", "jpn");
        items.Add("kl", "kal");
        items.Add("kn", "kan");
        items.Add("ks", "kas");
        items.Add("kr", "kau");
        items.Add("kk", "kaz");
        items.Add("km", "khm");
        items.Add("ki", "kik");
        items.Add("rw", "kin");
        items.Add("ky", "kir");
        items.Add("kv", "kom");
        items.Add("kg", "kon");
        items.Add("ko", "kor");
        items.Add("kj", "kua");
        items.Add("ku", "kur");
        items.Add("lo", "lao");
        items.Add("la", "lat");
        items.Add("lv", "lav");
        items.Add("li", "lim");
        items.Add("ln", "lin");
        items.Add("lt", "lit");
        items.Add("lb", "ltz");
        items.Add("lu", "lub");
        items.Add("lg", "lug");
        items.Add("mk", "mac");
        items.Add("mh", "mah");
        items.Add("ml", "mal");
        items.Add("mi", "mao");
        items.Add("mr", "mar");
        items.Add("ms", "may");
        items.Add("mg", "mlg");
        items.Add("mt", "mlt");
        items.Add("mn", "mon");
        items.Add("na", "nau");
        items.Add("nv", "nav");
        items.Add("nr", "nbl");
        items.Add("nd", "nde");
        items.Add("ng", "ndo");
        items.Add("ne", "nep");
        items.Add("nn", "nno");
        items.Add("nb", "nob");
        items.Add("no", "nor");
        items.Add("ny", "nya");
        items.Add("oc", "oci");
        items.Add("oj", "oji");
        items.Add("or", "ori");
        items.Add("om", "orm");
        items.Add("os", "oss");
        items.Add("pa", "pan");
        items.Add("fa", "per");
        items.Add("pi", "pli");
        items.Add("pl", "pol");
        items.Add("pt", "por");
        items.Add("ps", "pus");
        items.Add("qu", "que");
        items.Add("rm", "roh");
        items.Add("ro", "rum");
        items.Add("rn", "run");
        items.Add("ru", "rus");
        items.Add("sg", "sag");
        items.Add("sa", "san");
        items.Add("si", "sin");
        items.Add("sk", "slo");
        items.Add("sl", "slv");
        items.Add("se", "sme");
        items.Add("sm", "smo");
        items.Add("sn", "sna");
        items.Add("sd", "snd");
        items.Add("so", "som");
        items.Add("st", "sot");
        items.Add("es", "spa");
        items.Add("sc", "srd");
        items.Add("sr", "srp");
        items.Add("ss", "ssw");
        items.Add("su", "sun");
        items.Add("sw", "swa");
        items.Add("sv", "swe");
        items.Add("ty", "tah");
        items.Add("ta", "tam");
        items.Add("tt", "tat");
        items.Add("te", "tel");
        items.Add("tg", "tgk");
        items.Add("tl", "tgl");
        items.Add("th", "tha");
        items.Add("bo", "tib");
        items.Add("ti", "tir");
        items.Add("to", "ton");
        items.Add("tn", "tsn");
        items.Add("ts", "tso");
        items.Add("tk", "tuk");
        items.Add("tr", "tur");
        items.Add("tw", "twi");
        items.Add("ug", "uig");
        items.Add("uk", "ukr");
        items.Add("ur", "urd");
        items.Add("uz", "uzb");
        items.Add("ve", "ven");
        items.Add("vi", "vie");
        items.Add("vo", "vol");
        items.Add("cy", "wel");
        items.Add("wa", "wln");
        items.Add("wo", "wol");
        items.Add("xh", "xho");
        items.Add("yi", "yid");
        items.Add("yo", "yor");
        items.Add("za", "zha");
        items.Add("zu", "zul");
    }
}

tinohager avatar Oct 17 '22 20:10 tinohager

Hello! I tried to validate them and they are correct, but they appear to be the three-letter codes from the ISO 639-2 standard (its bibliographic variant, with English-based codes such as "ger" and "fre") rather than from ISO 639-3, which I think is more international and standard. @CasperWSchmidt, what do you think?

In addition, according to Wikipedia, the code "bh" is deprecated and no longer used (it is also present in the LanguageCode enum).

These are the codes in the ISO 639-3 standard (without "bh"):

var items = new Dictionary<string, string>();
items.Add("aa", "aar");
items.Add("ab", "abk");
items.Add("af", "afr");
items.Add("ak", "aka");
items.Add("sq", "sqi");
items.Add("am", "amh");
items.Add("ar", "ara");
items.Add("an", "arg");
items.Add("hy", "hye");
items.Add("as", "asm");
items.Add("av", "ava");
items.Add("ae", "ave");
items.Add("ay", "aym");
items.Add("az", "aze");
items.Add("ba", "bak");
items.Add("bm", "bam");
items.Add("eu", "eus");
items.Add("be", "bel");
items.Add("bn", "ben");
items.Add("bi", "bis");
items.Add("bs", "bos");
items.Add("br", "bre");
items.Add("bg", "bul");
items.Add("my", "mya");
items.Add("ca", "cat");
items.Add("ch", "cha");
items.Add("ce", "che");
items.Add("zh", "zho");
items.Add("cu", "chu");
items.Add("cv", "chv");
items.Add("kw", "cor");
items.Add("co", "cos");
items.Add("cr", "cre");
items.Add("cs", "ces");
items.Add("da", "dan");
items.Add("dv", "div");
items.Add("nl", "nld");
items.Add("dz", "dzo");
items.Add("en", "eng");
items.Add("eo", "epo");
items.Add("et", "est");
items.Add("ee", "ewe");
items.Add("fo", "fao");
items.Add("fj", "fij");
items.Add("fi", "fin");
items.Add("fr", "fra");
items.Add("fy", "fry");
items.Add("ff", "ful");
items.Add("ka", "kat");
items.Add("de", "deu");
items.Add("gd", "gla");
items.Add("ga", "gle");
items.Add("gl", "glg");
items.Add("gv", "glv");
items.Add("el", "ell");
items.Add("gn", "grn");
items.Add("gu", "guj");
items.Add("ht", "hat");
items.Add("ha", "hau");
items.Add("he", "heb");
items.Add("hz", "her");
items.Add("hi", "hin");
items.Add("ho", "hmo");
items.Add("hr", "hrv");
items.Add("hu", "hun");
items.Add("ig", "ibo");
items.Add("is", "isl");
items.Add("io", "ido");
items.Add("ii", "iii");
items.Add("iu", "iku");
items.Add("ie", "ile");
items.Add("ia", "ina");
items.Add("id", "ind");
items.Add("ik", "ipk");
items.Add("it", "ita");
items.Add("jv", "jav");
items.Add("ja", "jpn");
items.Add("kl", "kal");
items.Add("kn", "kan");
items.Add("ks", "kas");
items.Add("kr", "kau");
items.Add("kk", "kaz");
items.Add("km", "khm");
items.Add("ki", "kik");
items.Add("rw", "kin");
items.Add("ky", "kir");
items.Add("kv", "kom");
items.Add("kg", "kon");
items.Add("ko", "kor");
items.Add("kj", "kua");
items.Add("ku", "kur");
items.Add("lo", "lao");
items.Add("la", "lat");
items.Add("lv", "lav");
items.Add("li", "lim");
items.Add("ln", "lin");
items.Add("lt", "lit");
items.Add("lb", "ltz");
items.Add("lu", "lub");
items.Add("lg", "lug");
items.Add("mk", "mkd");
items.Add("mh", "mah");
items.Add("ml", "mal");
items.Add("mi", "mri");
items.Add("mr", "mar");
items.Add("ms", "msa");
items.Add("mg", "mlg");
items.Add("mt", "mlt");
items.Add("mn", "mon");
items.Add("na", "nau");
items.Add("nv", "nav");
items.Add("nr", "nbl");
items.Add("nd", "nde");
items.Add("ng", "ndo");
items.Add("ne", "nep");
items.Add("nn", "nno");
items.Add("nb", "nob");
items.Add("no", "nor");
items.Add("ny", "nya");
items.Add("oc", "oci");
items.Add("oj", "oji");
items.Add("or", "ori");
items.Add("om", "orm");
items.Add("os", "oss");
items.Add("pa", "pan");
items.Add("fa", "fas");
items.Add("pi", "pli");
items.Add("pl", "pol");
items.Add("pt", "por");
items.Add("ps", "pus");
items.Add("qu", "que");
items.Add("rm", "roh");
items.Add("ro", "ron");
items.Add("rn", "run");
items.Add("ru", "rus");
items.Add("sg", "sag");
items.Add("sa", "san");
items.Add("si", "sin");
items.Add("sk", "slk");
items.Add("sl", "slv");
items.Add("se", "sme");
items.Add("sm", "smo");
items.Add("sn", "sna");
items.Add("sd", "snd");
items.Add("so", "som");
items.Add("st", "sot");
items.Add("es", "spa");
items.Add("sc", "srd");
items.Add("sr", "srp");
items.Add("ss", "ssw");
items.Add("su", "sun");
items.Add("sw", "swa");
items.Add("sv", "swe");
items.Add("ty", "tah");
items.Add("ta", "tam");
items.Add("tt", "tat");
items.Add("te", "tel");
items.Add("tg", "tgk");
items.Add("tl", "tgl");
items.Add("th", "tha");
items.Add("bo", "bod");
items.Add("ti", "tir");
items.Add("to", "ton");
items.Add("tn", "tsn");
items.Add("ts", "tso");
items.Add("tk", "tuk");
items.Add("tr", "tur");
items.Add("tw", "twi");
items.Add("ug", "uig");
items.Add("uk", "ukr");
items.Add("ur", "urd");
items.Add("uz", "uzb");
items.Add("ve", "ven");
items.Add("vi", "vie");
items.Add("vo", "vol");
items.Add("cy", "cym");
items.Add("wa", "wln");
items.Add("wo", "wol");
items.Add("xh", "xho");
items.Add("yi", "yid");
items.Add("yo", "yor");
items.Add("za", "zha");
items.Add("zu", "zul");
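
To see exactly where the two lists above disagree, the dictionaries can be compared key by key. This is only a sketch; the class and method names are made up for illustration and nothing here is part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CodeListDiff
{
    // Returns the keys whose values differ between the two mappings,
    // together with both values. Keys missing from `second` are skipped.
    public static List<(string Key, string First, string Second)> Differences(
        IReadOnlyDictionary<string, string> first,
        IReadOnlyDictionary<string, string> second)
    {
        return first
            .Where(pair => second.TryGetValue(pair.Key, out var other) && other != pair.Value)
            .Select(pair => (Key: pair.Key, First: pair.Value, Second: second[pair.Key]))
            .OrderBy(item => item.Key, StringComparer.Ordinal)
            .ToList();
    }
}
```

Fed with the three entries "sq", "de" and "en" from each list, the result contains "de" (ger/deu) and "sq" (alb/sqi), while "en" is left out because both lists agree on "eng".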

IngBertolini avatar Oct 17 '22 21:10 IngBertolini

IMHO the ISO 639-3 standard might as well be used from the beginning if the three-letter codes are added. This will require the opposite relation between two- and three-letter codes, though, as multiple ISO 639-3 codes map to the same ISO 639-1 code. Hence the ISO 639-3 codes must be the keys of the dictionary :)
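
The inverted relation could look roughly like the sketch below. It assumes a tiny hand-picked sample in which the individual Arabic varieties "aeb" and "arz" are mapped to "ar" through their macrolanguage "ara"; the class and field names are hypothetical, and whether individual varieties should map to a two-letter code at all is a design decision this sketch does not settle.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Iso639Mapping
{
    // Keyed by three-letter code, so several keys can share one two-letter value.
    // Illustrative sample only, not a complete or authoritative table.
    public static readonly IReadOnlyDictionary<string, string> ThreeToTwo =
        new Dictionary<string, string>
        {
            ["ara"] = "ar", // Arabic (macrolanguage)
            ["aeb"] = "ar", // Tunisian Arabic
            ["arz"] = "ar", // Egyptian Arabic
            ["dan"] = "da",
            ["eng"] = "en",
        };

    // Groups the three-letter codes by two-letter code for the reverse lookup.
    public static readonly ILookup<string, string> TwoToThree =
        ThreeToTwo.ToLookup(pair => pair.Value, pair => pair.Key);
}
```

Using the two-letter codes as keys would not work for this sample, because adding "ar" a second time to a Dictionary throws an ArgumentException.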

CasperWSchmidt avatar Oct 21 '22 06:10 CasperWSchmidt

So we need this kind of mapping, where one of the ISO 639-1 languages can have multiple local languages. (site for reference)

Do you think that the library should also manage every single local language, or is it enough if the mapping simply returns the macrolanguage? Example:

ILanguageTranslation language = new TranslationProvider().GetLanguage("aeb");

Should this return an instance of TunisianArabicLanguageTranslation, or is it sufficient that it returns an instance of ArabicLanguageTranslation?

I think that the second alternative should be fine!
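
The second alternative could be sketched as a normalization step that runs before the translation lookup. `MacrolanguageResolver` and its two sample entries are assumptions made for illustration; this does not show how TranslationProvider works internally.

```csharp
using System;
using System.Collections.Generic;

public static class MacrolanguageResolver
{
    // Hypothetical sample: individual language -> macrolanguage (ISO 639-3 codes).
    private static readonly Dictionary<string, string> MacrolanguageOf =
        new Dictionary<string, string>
        {
            ["aeb"] = "ara", // Tunisian Arabic -> Arabic
            ["arz"] = "ara", // Egyptian Arabic -> Arabic
        };

    // Returns the macrolanguage code if one is known, otherwise the input unchanged.
    public static string Resolve(string iso6393Code)
    {
        if (iso6393Code == null) throw new ArgumentNullException(nameof(iso6393Code));
        return MacrolanguageOf.TryGetValue(iso6393Code, out var macro) ? macro : iso6393Code;
    }
}
```

A caller would pass the result of `Resolve("aeb")`, which is "ara" in this sample, to whatever lookup returns the Arabic translation, so no separate class per local language is needed.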

IngBertolini avatar Oct 21 '22 16:10 IngBertolini

I'm not really into the translation stuff; all I care about are the language codes for each country :) But I believe the answer to your question depends on how much each "local" language differs from the macrolanguage (e.g. Portuguese and Spanish are spoken in both Europe and South America, so the differences can be significant).

CasperWSchmidt avatar Oct 27 '22 07:10 CasperWSchmidt

Hi, does anyone want to suggest an implementation? Otherwise I will close the issue.

tinohager avatar Apr 11 '23 21:04 tinohager

I would love to but I'm afraid I have other tasks at hand with hard deadlines ATM :( If you keep it open I might be able to take a stab at it in a few months though...

CasperWSchmidt avatar Apr 12 '23 07:04 CasperWSchmidt