Support ISO 639 languages
What this PR does / why we need it: Some ISO 639 language codes are still not supported. The cases encountered so far are frm (Middle French) and fro (Old French).
Which issue(s) this PR closes: 8578
Closes #8578
Special notes for your reviewer:
Suggestions on how to test this:
Does this PR introduce a user interface change? If mockups are available, please link/include them here: No. It only adds new entries to the list of languages.
Is there a release notes update needed for this change?: Yes, to be included in this PR.
Additional documentation:
coverage: 20.726% (-0.02%) from 20.741% when pulling 820ff3323e5b8091215b517883cd5829ccd4c780 on 8578-support-extended-iso-639-languages into 0d279573bb8d7c96d7a4a1dc4b66b2258059dfba on develop.
:package: Pushed preview images as
ghcr.io/gdcc/dataverse:8578-support-extended-iso-639-languages
ghcr.io/gdcc/configbaker:8578-support-extended-iso-639-languages
:ship: See on GHCR. Use by referencing with full name as printed above, mind the registry name.
Includes 7900+ languages with no degradation to the UI.
@stevenwinship There are 3 extra .tab files in scripts/api/data/metadatablocks/iso-639-3_Code_Tables_20240415 (iso-639-3-macrolanguages.tab etc.). Are these checked in for reference purposes? (or, are they checked in on purpose?)
They were part of the original zip file. I don't think we need them, so I'll delete them.
Honestly, I'm not a fan of having a new mechanism and an API call dedicated to parsing this particular language-specific file format, particularly since it relies on one citation.properties file that has to cover both (plus the language-specific properties file variants). I also think the current implementation has problems with how it merges existing and new entries (see comments). Some of those are probably fixable in code (not sure about all; see the Pular example), but they wouldn't exist if we didn't try an auto-merge.
It may also be a pain when the citation block is updated. If that happens and you reload the block (say, due to a non-language change to some unrelated metadata field), the reload will restore just the alternates from the block and drop any that came from the ISO file. So unless the release notes also include the step of using the new API to update from the ISO file, we'd have unintended changes.
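To make the reload concern concrete, here is a minimal sketch in Python. It is a toy model, not Dataverse code: `merge_iso`, `reload_block`, and the sample terms and alternates are all hypothetical, and it assumes a reload only resets the alternates of terms that the block itself defines.

```python
# Toy model of a controlled vocabulary: term name -> set of alternate codes.
# All names and sample values here are hypothetical, for illustration only.

def merge_iso(vocab, iso_terms):
    """Return a copy of vocab with the ISO entries merged in (alternates are unioned)."""
    merged = {name: set(alts) for name, alts in vocab.items()}
    for name, alts in iso_terms.items():
        merged.setdefault(name, set()).update(alts)
    return merged

def reload_block(vocab, block_terms):
    """Return a copy of vocab where every term defined by the block gets
    exactly the block's alternates back (assumed reload behavior)."""
    reloaded = {name: set(alts) for name, alts in vocab.items()}
    for name, alts in block_terms.items():
        reloaded[name] = set(alts)
    return reloaded

CITATION_BLOCK = {"French": {"fre", "fr"}}           # hypothetical block content
ISO_FILE = {"French": {"fra"}, "Old French": {"fro"}}  # hypothetical ISO content

vocab = merge_iso(CITATION_BLOCK, ISO_FILE)
after_reload = reload_block(vocab, CITATION_BLOCK)
restored = merge_iso(after_reload, ISO_FILE)

print(sorted(vocab["French"]))         # alternates after the ISO merge
print(sorted(after_reload["French"]))  # alternates after reloading the block
print(sorted(restored["French"]))      # alternates after re-running the merge
```

In this model, the alternate that came only from the ISO file disappears after the reload and comes back only when the merge step is run again, which is the extra step the release notes would have to call out.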
I do like the idea of this being optional rather than a change to the citation block that everyone has to adopt, but I'm not sure doing it with a separate API and merging is worth it (versus, for example, some JavaScript that might allow filtering to common/complete lists for selection).
Now that we've confirmed that having the full list in the CV does not make the UI unusable, I would at least consider adding it in full to citation.tsv (and just giving up on the optional aspect of the expansion). My $0.02.
This is being replaced by #10762, so this PR should be closed.