country-coder ISO 639 Language Codes

Heya,

Would you be open to accepting a PR to add 'Official Languages' per-region?

I hunted around for an official source & didn't find many decent ISO 3166 <> ISO 639 mapping files.

This file from geonames contains a bunch of extra fields we could adopt, including the languages (which are correct from my spot checking): http://download.geonames.org/export/dump/countryInfo.txt
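
For illustration, here is a minimal sketch of how that file could be turned into a region-to-languages lookup. It is not part of country-coder: the function names `parseCountryLanguages` and `toLanguageCodes` are hypothetical, and the column positions (country code in column 0, comma-separated languages in column 15 of a tab-separated row, `#` for comment lines) are my assumption about the file layout and should be checked against the header in the download.

```typescript
// ASSUMPTION: tab-separated rows, '#' comment lines, ISO 3166-1 alpha-2
// code in column 0 and a comma-separated language list in column 15.
const ISO_COLUMN = 0;
const LANGUAGES_COLUMN = 15;

// Build a lookup of country code -> language tags from the file's text.
function parseCountryLanguages(
  text: string,
  languagesColumn: number = LANGUAGES_COLUMN
): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const line of text.split(/\r?\n/)) {
    // Skip blank lines and comment/header lines.
    if (line.trim() === '' || line.charAt(0) === '#') continue;
    const fields = line.split('\t');
    const iso = (fields[ISO_COLUMN] || '').trim().toUpperCase();
    if (iso === '') continue;
    // A missing or empty languages field yields an empty list.
    result[iso] = (fields[languagesColumn] || '')
      .split(',')
      .map((tag) => tag.trim())
      .filter((tag) => tag.length > 0);
  }
  return result;
}

// Reduce tags such as "de-CH" to the bare language code "de",
// keeping first-seen order and dropping duplicates.
function toLanguageCodes(tags: string[]): string[] {
  const codes = tags.map((tag) => (tag.split('-')[0] || '').toLowerCase());
  return codes.filter((code, index) => code !== '' && codes.indexOf(code) === index);
}
```

The raw tags are kept as-is in the lookup because they carry a region suffix that some consumers may want; `toLanguageCodes` is only needed when plain ISO 639 codes are enough.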

Another option would be to use a locale dataset from a Linux distro; I feel like those would be fairly complete and well maintained.

Please let me know if that's something you'd accept.

Sep 06 '23 11:09 missinglink

Maybe? I’m not sure how a language-per-region dataset would be useful in the context of OpenStreetMap editing.

Looping in @1ec5 too, as he knows more about this than I do.

Sep 06 '23 12:09 bhousel

I think that's a fair point. I'm not actually planning on using the library for Rapid (or for OSM, in many cases), but I find the simplified polygons quite useful and would prefer to collaborate on them rather than have another repo to maintain.

There are some other fields in that geonames download which might be helpful for some, but not for everyone. If that's the case I could put these extra fields in separate files and have tree shaking remove what's not used.

Sep 06 '23 12:09 missinglink

Thanks! Sorry I commented kind of quickly before. We can definitely add data even if it isn't used in Rapid or OSM. I'm mostly asking for clarification because it's probably not data that I would use myself.

"Spoken/official" languages was mentioned in https://github.com/rapideditor/country-coder/issues/4#issuecomment-561609300 too.

Sep 06 '23 13:09 bhousel

Here's the list BTW https://gist.github.com/missinglink/ebe8ba69e58dbdfb47750f6079364ecc

Sep 06 '23 13:09 missinglink

Minh has a point about maybe needing more granular shapes in some areas, e.g. Switzerland.

Sep 06 '23 13:09 missinglink

I think any addition of language codes would need to come with a caveat emptor. Every use case requires a different mapping from countries to languages. Official languages don’t necessarily say anything about the name language in OSM, or the language that users in that country are searching in. iD maintains a mapping for determining the name:* fields to show by default, which started from CLDR but required some tweaks afterwards based on user feedback. This is a pretty limited use of the data, since users can easily customize the language list.

Sep 06 '23 16:09 1ec5