Support ISO 639 languages

Open stevenwinship opened this issue 1 year ago • 6 comments

What this PR does / why we need it: Some ISO 639 codes are still not supported. The cases encountered so far are frm (Middle French) and fro (Old French).

Which issue(s) this PR closes: 8578

Closes #8578

Special notes for your reviewer:

Suggestions on how to test this:

Does this PR introduce a user interface change? If mockups are available, please link/include them here: No. Only new languages in the list of languages

Is there a release notes update needed for this change?: Yes, to be included in this PR.

Additional documentation:

stevenwinship avatar May 22 '24 18:05 stevenwinship

Coverage Status

coverage: 20.726% (-0.02%) from 20.741% when pulling 820ff3323e5b8091215b517883cd5829ccd4c780 on 8578-support-extended-iso-639-languages into 0d279573bb8d7c96d7a4a1dc4b66b2258059dfba on develop.

coveralls avatar May 22 '24 18:05 coveralls

:package: Pushed preview images as

ghcr.io/gdcc/dataverse:8578-support-extended-iso-639-languages
ghcr.io/gdcc/configbaker:8578-support-extended-iso-639-languages

:ship: See on GHCR. Use by referencing with full name as printed above, mind the registry name.

github-actions[bot] avatar May 22 '24 18:05 github-actions[bot]

Includes 7900+ languages with no degradation to the UI.

stevenwinship avatar Jul 30 '24 18:07 stevenwinship

@stevenwinship There are 3 extra .tab files in scripts/api/data/metadatablocks/iso-639-3_Code_Tables_20240415 (iso-639-3-macrolanguages.tab etc.). Are these checked in for reference purposes? (or, are they checked in on purpose?)

landreev avatar Aug 02 '24 19:08 landreev

> @stevenwinship There are 3 extra .tab files in scripts/api/data/metadatablocks/iso-639-3_Code_Tables_20240415 (iso-639-3-macrolanguages.tab etc.). Are these checked in for reference purposes? (or, are they checked in on purpose?)

They were part of the original zip file. I don't think we need them, so I'll delete them.

stevenwinship avatar Aug 02 '24 19:08 stevenwinship

Honestly, I'm not a fan of adding a new mechanism and an API call dedicated to parsing this particular language-specific file format, particularly since it relies on one citation.properties file that has to cover both (plus the language-specific properties file variants). I also think the current code has problems in how it merges existing and new entries (see comments). Some of those are probably fixable in code (I'm not sure about all of them; see the Pular example), but they wouldn't exist if we didn't try an auto-merge.

It may also be a pain when the citation block is updated. If that happens and you reload the block (say, due to a non-language change to some unrelated metadata field), the reload will restore just the alternates from the block and drop any from the ISO file. So unless we add a step to the release notes telling people to also use the new API to update from the ISO file, we'd get unintended changes.
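The reload concern can be illustrated with a minimal Python sketch. It is not Dataverse code: the dictionaries, the `merge_iso_alternates` and `reload_block` helpers, and the sample codes are all hypothetical, and the reload behaviour (alternates of block entries reset to the block's own, other entries left alone) is an assumption based only on the description above.

```python
# Hypothetical model: each controlled-vocabulary entry maps a language name
# to its set of alternate codes. None of these names exist in Dataverse.

def merge_iso_alternates(block_values, iso_values):
    """Return block_values with the alternates from iso_values merged in."""
    merged = {name: set(alts) for name, alts in block_values.items()}
    for name, alts in iso_values.items():
        merged.setdefault(name, set()).update(alts)
    return merged


def reload_block(stored_values, block_values):
    """Assumed reload behaviour: every entry defined in the block gets its
    alternates reset to exactly what the block lists."""
    reloaded = {name: set(alts) for name, alts in stored_values.items()}
    for name, alts in block_values.items():
        reloaded[name] = set(alts)
    return reloaded


citation_block = {"French": {"fr", "fre"}}               # from the block file
iso_file = {"French": {"fra"}, "Old French": {"fro"}}    # from the ISO file

stored = merge_iso_alternates(citation_block, iso_file)
after_reload = reload_block(stored, citation_block)

# Alternates that came only from the ISO file are gone after the reload.
dropped = stored["French"] - after_reload["French"]
print(dropped)
```

Under these assumptions, restoring the dropped alternates requires running the ISO merge again after every block reload, which is the extra release-notes step mentioned above.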

I do like the idea of this being optional and not a change to the citation block that everyone has to adopt, but I'm not sure doing it with a separate API and merging is worth it (versus, for example, JavaScript that might allow filtering to common/complete lists for selection).

qqmyers avatar Aug 05 '24 20:08 qqmyers

Now that we have confirmed that having the full list in the CV does not make the UI unusable, I would at least consider adding it in full to citation.tsv (and just giving up on the optional aspect of the expansion). My $0.02.

landreev avatar Aug 07 '24 14:08 landreev

This is being replaced by #10762, so this should close.

qqmyers avatar Aug 14 '24 19:08 qqmyers