Language tags and 880 fields
In regard to internationalization, the logic for applying language tags needs work for parallel-script fields (880), e.g. with translations or parallel titles.
Incorrect Language Tags and Script Subtags
For example, problems crop up with OCLC # 271414, an English translation of a Russian work.
<http://lib.washington.edu/ld/test/99114652250001452#Work880-45> a bf:Work ;
rdfs:label "Евгений Онегин."@en-cyrl ;
The label is in Cyrillic script, but its language is Russian, not English.
Work [ a bflc:Relationship ;
bflc:relation [ a bflc:Relation ;
rdfs:label "Container of (expression)"@en-cyrl ] ;
bf:relatedTo <http://lib.washington.edu/ld/test/99114652250001452#Work880-44> ].
The label is English but not Cyrillic. In general, it is vanishingly rare for a string to be both in the English language and in the Cyrillic script.
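A rough sanity check for this kind of mismatch could compare the script subtag against the script the string is actually written in. The sketch below is only an illustration, not how the converter works: the function names and the `SCRIPT_NAME_PREFIX` table are hypothetical, and it guesses the script from the first word of each letter's Unicode character name, which is a heuristic rather than a proper script property lookup. It can catch a Latin string tagged as Cyrillic, but it cannot tell Russian from English.

```python
import unicodedata

# Hypothetical table: lowercase script subtag -> first word of the Unicode
# character names used by that script. Only a few scripts are listed.
SCRIPT_NAME_PREFIX = {"cyrl": "CYRILLIC", "latn": "LATIN", "hani": "CJK"}


def dominant_script(text):
    """Return the most common character-name prefix among the letters of text."""
    counts = {}
    for ch in text:
        if ch.isalpha():
            prefix = unicodedata.name(ch, "").split(" ", 1)[0]
            counts[prefix] = counts.get(prefix, 0) + 1
    return max(counts, key=counts.get) if counts else None


def script_subtag(tag):
    """Return the lowercase script subtag of a language tag, or None."""
    for subtag in tag.split("-")[1:]:
        # A script subtag is exactly four letters.
        if len(subtag) == 4 and subtag.isalpha():
            return subtag.lower()
    return None


def script_matches(text, tag):
    """True/False if the check can be made, None if it cannot."""
    expected = SCRIPT_NAME_PREFIX.get(script_subtag(tag))
    found = dominant_script(text)
    if expected is None or found is None:
        return None
    return expected == found


print(script_matches("Container of (expression)", "en-cyrl"))  # False
print(script_matches("Евгений Онегин.", "en-cyrl"))            # True
```

The second call returns True because the script really is Cyrillic; the wrong language subtag in that example would need a different kind of check.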
OCLC # 793950140, a Chinese translation of a Japanese work.
<http://lib.washington.edu/ld/test/99131426860001452#Work> a bf:Text,
bf:Work ;
rdfs:label "Inō Kanori no Taiwan tōsa nikki. Chinese",
"伊能嘉矩の臺湾踏柤日記. Chinese"@zh-hani .
The title in the label is Japanese, not Chinese.
OCLC # 893875561, a Latvian book with a parallel title in Russian.
[ a bf:ParallelTitle,
bf:Title,
bf:VariantTitle ;
rdfs:label "Заяц и его друзья : латышские народные сказки о животных"@lv-cyrl ;
bf:mainTitle "Заяц и его друзья"@lv-cyrl ;
bf:subtitle "латышские народные сказки о животных"@lv-cyrl ]
The title in the label, mainTitle and subtitle is Russian, not Latvian.
Compliance with IETF RFC 5646
Use of language tags should follow the practices given in IETF RFC 5646 [1]. Concerning the script subtag, on page 12 it states “[it] SHOULD be omitted when it adds no distinguishing value to the tag or when the primary or extended language subtag's record in the subtag registry includes a 'Suppress-Script' field listing the applicable script subtag”.
For example, for OCLC # 1779370:
<http://lib.washington.edu/ld/test/99129152590001452#Agent880-32> a bf:Agent,
bf:Jurisdiction ;
rdfs:label "Russia. Министерство народнаго просвѣщенія."@ru-cyrl .
The registry record for Russian has a Suppress-Script field listing Cyrl, so under RFC 5646 the Cyrillic script subtag should be omitted here and the tag should simply be "ru".
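The Suppress-Script rule could be applied as a small normalization step before a tag is written out. The following is a minimal sketch under stated assumptions: `SUPPRESS_SCRIPT` is a hand-copied excerpt of a few registry entries (a real implementation would load the full IANA Language Subtag Registry), the function name is hypothetical, and only the simple case where the script subtag directly follows the language subtag is handled.

```python
# Hand-copied excerpt of Suppress-Script values; an assumption for
# illustration only. Verify against the IANA Language Subtag Registry.
SUPPRESS_SCRIPT = {"en": "Latn", "ru": "Cyrl", "lv": "Latn", "ja": "Jpan"}


def drop_suppressed_script(tag):
    """Remove the script subtag when it equals the language's Suppress-Script."""
    subtags = tag.split("-")
    if len(subtags) >= 2:
        language, second = subtags[0].lower(), subtags[1]
        suppressed = SUPPRESS_SCRIPT.get(language, "")
        # A script subtag is exactly four letters; compare case-insensitively.
        if len(second) == 4 and second.isalpha() and second.lower() == suppressed.lower():
            return "-".join([subtags[0]] + subtags[2:])
    return tag


print(drop_suppressed_script("ru-cyrl"))  # ru
print(drop_suppressed_script("lv-cyrl"))  # lv-cyrl
```

Note that "lv-cyrl" is left alone: Cyrl is not the suppressed script for Latvian, so the subtag does add information there, even though in the example above the language subtag itself is wrong.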
Not Good Practice
Using a language tag for numeric data in bf:part is not wrong but probably not a good practice.
<http://lib.washington.edu/ld/test/99129152590001452#Instance880-38> a bf:Instance ;
bf:part "1825-29"@ru-cyrl ;
bf:title [ a bf:Title ;
rdfs:label "Записки"@ru-cyrl ] .
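One possible way to avoid tagging purely numeric values is to attach a language tag only when the literal contains at least one letter. This is just a sketch of that idea; the function name is hypothetical and the rule is an assumption, not something the conversion specs define.

```python
def needs_language_tag(value):
    """Return True only if the literal contains at least one letter."""
    return any(ch.isalpha() for ch in value)


print(needs_language_tag("1825-29"))  # False
print(needs_language_tag("Записки"))  # True
```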
[1] https://tools.ietf.org/html/bcp47
This is complicated, since I think some of this is bad data rather than bad conversion. We'll investigate and report back.
I've also seen the converter create @ru-cyrl language tags where the -cyrl is redundant and, per BCP 47, should be omitted. I've chosen to ignore them for now.
The specs are going to be updated - pretty sure the best solution is to stop adding tags based on 008+$6.
If the MARC record included the language along with the script, it would be different, and that is technically possible. We were going to look into that as well.