Proper language codes in translated files
We should set proper language codes in the translation files. This includes both README and docs translations.
🐞 Problem As I already noted in the issue for the Bosnian language, some languages don't have their own code in the file name, which can cause confusion.
💡 Possible solutions Hello @Roshanjossey, maybe we can list all the wrong codes in this issue and give the community a chance to fix the names. As a reference, we can use the ISO 639 list. If a language has a two-letter code, use it. If not, use the three-letter code. We can exclude non-regular languages from this list, like pirate and alien languages.
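The "two-letter first, three-letter otherwise" rule can be sketched in a few lines of Python. This is only an illustration: `pick_code` is a hypothetical helper, and the code pairs are typed in by hand rather than read from an ISO 639 data source.

```python
def pick_code(alpha2, alpha3):
    """Prefer the two-letter ISO 639 code; fall back to the three-letter one.

    Hypothetical helper for illustration. Pass None (or "") for alpha2
    when the language has no two-letter code.
    """
    return alpha2 if alpha2 else alpha3


# Hand-entered examples, not pulled from an ISO 639 data source.
print(pick_code("bs", "bos"))   # Bosnian has a two-letter code
print(pick_code(None, "cnr"))   # Montenegrin only has a three-letter code
print(pick_code(None, "arq"))   # Algerian Arabic only has a three-letter code
```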
Hi, I would like to work on it.
Hey @Roshanjossey, do you agree to change these language codes?
Full list of wrong language codes:
- README.afk.md - af
- README.al.md - sq
- README.arm.md - hy
  - Duplicate of README.hy.md
- README.assamese.md - as
- README.aze.md - az
- README.bih.md - bs
- README.by.md - be
- README.col.md - es-co maybe?
- README.dz.md - arq
- README.ec.md - es-ec maybe?
- README.eg.md - arz
- README.ewe.md - ee
- README.ge.md - ka
- README.gh.md - tw or ak
- README.gr.md - el
- README.guj.md - gu
- README.hau.md - ha
- README.hb.md - he
- README.igb.md - ig
- README.ka.md - kn
- README.kh.md - km
- README.kr.md - ku
- README.kws.md - sw
- README.kz.md - kk
- README.la.md - lo
- README.lug.md - lg
- README.ma.md - ary
- README.me.md - cnr
- README.mli.md - ???
- README.mm_unicode.md - my
- README.mx.md - es-mx maybe?
- README.my.md - ms
- README.np.md - ne or npi
- README.od.md - or, ory or ori
- README.pb.md - pa
- README.pt_br.md - pt-br
- README.se.md - sv
- README.sindhi.md - sd
- README.slk.md - sk
  - Duplicate of README.sk.md
- README.tm.md - tk
- README.tn.md - aeb
- README.ua.md - uk
- README.vn.md - vi
- README.yor.md - yo
- README.zul.md - zu
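One detail the list implies: some target names are currently taken by other files. README.ge.md can only become README.ka.md after README.ka.md has moved to kn, and the same goes for mm_unicode, my and ms. A rough sketch of how the renames could be ordered is below. `plan_renames` and `readme_name` are hypothetical names, and the file list is a small hand-picked subset, not the real repository contents.

```python
def readme_name(code):
    """Build a README file name from a language code (assumed naming scheme)."""
    return f"README.{code}.md"


def plan_renames(existing_files, fixes):
    """Order renames so a file name is freed before another file takes it.

    Returns (ordered, conflicts). Conflicts are renames whose target
    still exists at the end, e.g. duplicates that need a manual merge.
    """
    files = set(existing_files)
    pending = {
        readme_name(old): readme_name(new)
        for old, new in fixes.items()
        if readme_name(old) in files
    }
    ordered = []
    progress = True
    while pending and progress:
        progress = False
        for old in sorted(pending):  # sorted() copies the keys, so deleting is safe
            new = pending[old]
            if new not in files:
                ordered.append((old, new))
                files.remove(old)
                files.add(new)
                del pending[old]
                progress = True
    return ordered, sorted(pending.items())


existing = [
    "README.arm.md", "README.hy.md", "README.ge.md",
    "README.ka.md", "README.mm_unicode.md", "README.my.md",
]
fixes = {"arm": "hy", "ge": "ka", "ka": "kn", "mm_unicode": "my", "my": "ms"}

ordered, conflicts = plan_renames(existing, fixes)
for old, new in ordered:
    print(f"git mv {old} {new}")
for old, new in conflicts:
    print(f"needs manual review: {old} -> {new}")
```

With this subset, README.arm.md is reported for manual review because README.hy.md already exists, which matches the "Duplicate" note in the list.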
Hello @rammba,
I think README.mli.md is in Bambara, so the language code should be bm. I should say first that I don't know the language, so I could be wrong. If no native speaker can help identify the language, we can use bm for the time being.
I checked the text in the README using both Google Translate and tomedes (a website that claims to support 270 languages), and both detected that the text was in Bambara. (No other language identification website could identify this language, so I now understand why the list uses three question marks.)
Note: This is just a suggestion.
@rammba, thank you for opening this issue.
I agree that we need to standardise language codes. I'll check other existing language code standards too.
One problem I see with this change is existing open issues about things to fix in translation files. All of these have links to which file to change and they will break. @Sharl0tteIsTaken also mentioned this.
My suggestion would be to keep a checklist to track what's updated and what's not, and chip away at this gradually. For each language code that's updated, we update the corresponding issues and update the checklist. What do you think?
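As a rough idea of what such a checklist could look like, here is a small sketch that prints a markdown task list from (old, new) file name pairs. `checklist` is a hypothetical helper and the two pairs are just examples taken from the list above.

```python
def checklist(renames, done=()):
    """Render (old, new) rename pairs as a markdown task list.

    Hypothetical helper: entries whose old name is in `done` get a checked box.
    """
    done = set(done)
    lines = []
    for old, new in renames:
        mark = "x" if old in done else " "
        lines.append(f"- [{mark}] `{old}` -> `{new}`")
    return "\n".join(lines)


renames = [
    ("README.gr.md", "README.el.md"),
    ("README.ua.md", "README.uk.md"),
]
print(checklist(renames, done=["README.gr.md"]))
```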
> I think README.mli.md is in Bambara, so the language code should be bm
Hey @Sharl0tteIsTaken, since nobody here knows what README.mli.md represents, maybe we can leave it as it is for now. Neither of us is a native speaker of Bambara, so unfortunately we can't prove it's that language.
> For each language code that's updated, we update corresponding issues and update the checklist. What do you think?
Hey @Roshanjossey, my idea was to keep track of all the needed changes in this issue, including checking off the finished/fixed language codes. You can create separate issues for all the languages and let the community fix the names. Also, you can create an issue for a language once the other issues for that language are fixed (alt text & bash codes for now). What do you think?
Hey, assign this to me. I would like to work on it.
Addition to my previous comment (https://github.com/firstcontributions/first-contributions/issues/105711#issuecomment-3429455018):
Since I'm not a native Arabic speaker, maybe we should use different codes for the already existing Arabic languages:
- README.dz.md - arq or ar-dz, or to also add Latn script as in README.sr-Latn.md
- README.eg.md - arz or ar-eg
- README.ma.md - ary or ar-ma, or to also add Latn script as in README.sr-Latn.md
- README.tn.md - aeb or ar-tn, or to also add Latn script as in README.sr-Latn.md