
Bug 12772: Add support for ISO 639-3 part 3 standard language code

Open hgohel opened this issue 1 year ago • 8 comments

This change is being committed on behalf of @jocelyn who submitted a patch in the Gramps Bug Tracker #12772.

@lordemannd please test & validate. Thanks!

hgohel avatar Sep 16 '23 19:09 hgohel

I'm not sure that I like 8000 lines of code defining a list.

Where does the data come from? How do we maintain it?

Consider downloading the iso-639-3.tab file from the SIL site and reading it at startup in the const.py file.

Use code similar to the following:

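with open("iso-639-3.tab", "r", encoding="utf-8") as iso_file:
    ISO_CODES = []       # two-letter ISO 639-1 codes
    ISO_CODES_3 = []     # three-letter ISO 639-3 codes
    ISO_CODES_3TO2 = []  # (ISO 639-3, ISO 639-1) pairs
    next(iso_file)       # skip the header row
    for line in iso_file:
        fields = line.rstrip("\n").split("\t")
        ISO_CODES_3.append(fields[0])  # Id column
        if fields[3]:                  # Part1 column, empty for most languages
            ISO_CODES.append(fields[3])
            ISO_CODES_3TO2.append((fields[0], fields[3]))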

Nick-Hall avatar Sep 16 '23 21:09 Nick-Hall

> I'm not sure that I like 8000 lines of code defining a list.
>
> Where does the data come from? How do we maintain it?

If you're open to a new dependency, pycountry provides many of these ISO data sets.

QuLogic avatar Sep 16 '23 22:09 QuLogic

For testing on Windows, is GrampsAIO-5.2.0-r1-95a8bd3_win64.exe, dated Aug 31, 2023, the right place to start (plus this PR)?

lordemannd avatar Sep 17 '23 17:09 lordemannd

@Nick-Hall

> I'm not sure that I like 8000 lines of code defining a list. Where does the data come from? How do we maintain it?

Agreed! Git history shows that the Black formatter reformatted the file to have one list entry per line. Even without that reformatting your questions would remain valid, so it makes sense to do something better.

> Consider downloading the iso-639-3.tab file from the SIL site and reading it at startup in the const.py file.

Using the code you provided, this should be fairly easy. The only issue I'm having is getting the app to find iso-639-3.tab. I placed it in data/iso-639-3.tab and expected the build process to copy it to the resources directory, but I'm missing something. MANIFEST.in implies that everything in the data/ directory should be copied, but it hasn't worked so far. Any suggestions?

@lordemannd Looks like this might undergo some more revisions before the code is ready for testing.

hgohel avatar Sep 18 '23 02:09 hgohel

> If you're open to a new dependency, pycountry provides many of these ISO data sets.

pycountry looks good, although there is a concern about the lack of commits in the last year. An alternative project, isocodes, was suggested and looks to be actively maintained.

hgohel avatar Sep 18 '23 03:09 hgohel

@Nick-Hall

> Only issue I'm having is getting the app to find iso-639-3.tab. I placed it in data/iso-639-3.tab and was expecting the build process to copy it to resources directory but I'm missing something. MANIFEST.in implies that everything in the data/ directory should be copied but it hasn't worked so far. Any suggestions?

It turns out I was looking in the wrong directory (doc instead of data) when opening the file. I have the code running and will create a separate PR for an implementation based on your suggestion to read iso-639-3.tab. I'm still not sure whether something needs to be done to make sure the file gets bundled into the installer.
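
For reference, a minimal sketch of how the lookup could be made less fragile, assuming the table lives in a data directory passed in by the caller. The function name `load_iso_639_3` and the `data_dir` parameter are hypothetical and not part of Gramps. The sketch builds the full path and reports it when the file is missing, so that a wrong directory is easy to spot.

```python
from pathlib import Path


def load_iso_639_3(data_dir):
    """Return (codes_2, codes_3, pairs) read from <data_dir>/iso-639-3.tab.

    Hypothetical helper: the name and the data_dir argument are assumptions.
    """
    path = Path(data_dir) / "iso-639-3.tab"
    if not path.is_file():
        # Report the full path so a wrong directory is easy to spot.
        raise FileNotFoundError(f"ISO 639-3 table not found at {path.resolve()}")

    codes_2, codes_3, pairs = [], [], []
    with path.open(encoding="utf-8") as iso_file:
        next(iso_file, None)  # skip the header row
        for line in iso_file:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # ignore blank or malformed lines
            codes_3.append(fields[0])      # Id column (ISO 639-3)
            if fields[3]:                  # Part1 column (ISO 639-1), often empty
                codes_2.append(fields[3])
                pairs.append((fields[0], fields[3]))
    return codes_2, codes_3, pairs
```

Calling `load_iso_639_3("doc")` when the file is actually in `data` would then fail with a message naming the path that was tried, rather than a bare relative filename.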

hgohel avatar Sep 18 '23 03:09 hgohel

The python3-pycountry package seems to be widely available. I checked MSYS2, Arch, Debian and Fedora. Is the validation of the language field important? Perhaps we could allow the python3-pycountry package to be optional.
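
A rough sketch of what an optional dependency could look like, assuming validation is simply skipped when pycountry is not installed. The function name `is_valid_language_code` is hypothetical; `pycountry.languages.get` with the `alpha_2` and `alpha_3` keywords is the only pycountry call used.

```python
try:
    import pycountry  # optional dependency
except ImportError:
    pycountry = None


def is_valid_language_code(code):
    """Check a 2- or 3-letter language code against pycountry if available.

    Hypothetical helper: without pycountry the check is skipped and every
    code is accepted.
    """
    if pycountry is None:
        return True
    if len(code) == 2:
        key = "alpha_2"
    elif len(code) == 3:
        key = "alpha_3"
    else:
        return False
    try:
        return pycountry.languages.get(**{key: code}) is not None
    except LookupError:
        # Some pycountry releases raise KeyError instead of returning None.
        return False
```

With this shape, installations without the package keep working and simply lose validation of the language field.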

To include the iso-639-3.tab file in the distribution you will need to add a line in setup.py.
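
As an illustration only, such a line might take the following shape, using the standard `data_files` format of (install directory, list of source files). The target directory `share/gramps` and the names `ISO_DATA_FILES` and `existing_data_files` are assumptions, not taken from the Gramps setup.py.

```python
# Hypothetical setup.py fragment; the install directory is an assumption.
ISO_DATA_FILES = [
    # (install directory relative to the prefix, [source files])
    ("share/gramps", ["data/iso-639-3.tab"]),
]

# In setup.py this list would be passed along with the other data files:
#     setup(..., data_files=existing_data_files + ISO_DATA_FILES)
```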

Nick-Hall avatar Sep 18 '23 21:09 Nick-Hall

> > If you're open to a new dependency, pycountry provides many of these ISO data sets.
>
> PyCountry looks good, although there is a concern about lack of commits in the last year; an alternate project, isocodes was suggested which looks to be actively maintained.

Hi, indeed my project is actively maintained and has typing support. :)

Atem18 avatar Oct 03 '23 14:10 Atem18

Closing in favour of PR #1706.

I decided to use the pycountry module, as it is widely available and seems to be actively supported now.

Nick-Hall avatar Apr 16 '24 18:04 Nick-Hall