Order of name components no longer imposed

Open HelenaSabel opened this issue 2 years ago • 6 comments

This PR would close #325.

Deleting the templates for forename and surname (the solution presented in the Stylesheets meeting) was problematic on its own because it created a lot of whitespace problems in the outputs. The other changes in common-core.xsl propose a better handling of whitespace (and the addition of periods after the last author in bibliographic references).

I also made a couple of corrections in the test files because I consider that things like this (from test.xml) are wrong:

<name><forename>Charles</forename><surname>Dickens</surname></name>

I therefore added a space between the forename and the surname in this case, and also in test27.xml and in the bibliographic test of Test2.
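To illustrate why the space in the source matters once the order and separation of components are no longer imposed, here is a minimal Python sketch (not the XSLT in common-core.xsl). It assumes a processor that simply emits the text of the name in document order and normalizes whitespace; the function name `render_as_encoded` is hypothetical.

```python
import xml.etree.ElementTree as ET


def render_as_encoded(xml_fragment: str) -> str:
    """Concatenate all text nodes in document order, then collapse whitespace."""
    element = ET.fromstring(xml_fragment)
    text = "".join(element.itertext())
    # split()/join() collapses runs of whitespace and trims both ends
    return " ".join(text.split())


# The encoding originally found in test.xml: no whitespace between components
without_space = "<name><forename>Charles</forename><surname>Dickens</surname></name>"
# The corrected encoding: a space between forename and surname
with_space = "<name><forename>Charles</forename> <surname>Dickens</surname></name>"

print(render_as_encoded(without_space))  # CharlesDickens
print(render_as_encoded(with_space))     # Charles Dickens
```

Under this assumption the output follows the source exactly, so the only way to get a separator is to encode one.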

The test in Test concerning whitespace passed, but the one in Test2 failed (see changes in https://github.com/TEIC/Stylesheets/commit/75fbb0824aacb6785edb047981c47e91e733e5d3). I would like to ask the reviewers to pay special attention to this.

HelenaSabel avatar Sep 30 '23 22:09 HelenaSabel

I also made a couple of corrections in the test files because I consider that things like this (from test.xml) are wrong:

<name><forename>Charles</forename><surname>Dickens</surname></name>

That seems perfectly correct XML to me. (And I daresay, it is probably in test.xml like that just to make sure the Stylesheets do not screw up when a name is encoded like this.)

This encoding says “This is a name; it has two components, a forename and a surname; the forename is ‘Charles’; the surname is ‘Dickens’.” It makes no assertion about what a processor should do with that information. One quite reasonable approach is to output “Dickens, Charles”. Another is to output “Charles Dickens”. A third is to generate the URL “https://en.wikipedia.org/wiki/Charles_Dickens”. A fourth is to generate the personography key “./authors.xml#Dickens.Charles”. And of course, crunching it into “CharlesDickens” is a viable (if ugly) alternative, too.
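The five outputs listed above can all be derived from the same encoding. The following Python sketch is only an illustration of that point, not how the Stylesheets work; the dictionary keys are hypothetical labels chosen for this example.

```python
import xml.etree.ElementTree as ET

# The encoding under discussion, with no whitespace between components
name = ET.fromstring(
    "<name><forename>Charles</forename><surname>Dickens</surname></name>"
)
forename = name.findtext("forename")
surname = name.findtext("surname")

# One encoding, several equally possible processing results
renderings = {
    "inverted": f"{surname}, {forename}",
    "natural": f"{forename} {surname}",
    "wikipedia_url": f"https://en.wikipedia.org/wiki/{forename}_{surname}",
    "personography_key": f"./authors.xml#{surname}.{forename}",
    "crunched": f"{forename}{surname}",
}

for label, value in renderings.items():
    print(f"{label}: {value}")
```

Each rendering is a decision made by the processor; none of them is stated in the markup itself.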

sydb avatar Oct 26 '23 01:10 sydb

What concerns me about that is falling back into Western-centric practices, with the Stylesheets separating or reordering components following conventions that are not universal. That’s why I think a space in the source file is relevant: in my opinion, it indicates that in the case of the name “Charles Dickens”, the name components are not agglutinated (unlike in some other languages, where they are).

HelenaSabel avatar Oct 26 '23 07:10 HelenaSabel

Really good point, but I do not think it invalidates the correctness of

<name><surname>石井</surname><forename>四郎</forename></name>

(Nor the correctness of

<name> <surname>Ishii</surname> <forename>Shirō</forename> </name>

nor of

<name>
  <surname>石井</surname>
  <pc force="strong">&#x20;</pc>
  <forename>四郎</forename>
</name>

.) I do not actually know a culture in which name components are agglutinated (Turkish? Finnish?), but certainly

<name>
  <surname>FAMILY</surname>
  <pc join="both"/>
  <forename>given</forename>
</name>

is worth considering.

Boils down to what used to be thought of as data-centric vs text-centered encoding, I guess.

All that said, just because <name><forename>John</forename><surname>Lennon</surname></name> is a perfectly reasonable encoding does not mean our Stylesheets have to process it.

sydb avatar Oct 26 '23 17:10 sydb

This PR goes back to draft, pending some improvements in the handling of bibliographic references.

HelenaSabel avatar Oct 26 '23 19:10 HelenaSabel

Council at VF2F 16 March 2024: We recognize that the changes here are definitely an improvement for a more culturally sensitive processing of parts of a name based on the source encoding. @joeytakeda suggests we should keep the option to access the original template. For users who work with teiGarage, we should be sure this kind of processing is available to them at the command line.

@HelenaSabel should update the branch and then request reviewers look at this again to be sure it's okay to merge, and to decide whether to allow the option to access the original template.

ebeshero avatar Mar 16 '24 15:03 ebeshero

I have worked on a project which mixed Icelandic names with other primarily European and North American names in reference lists, and required that they be processed/output differently; eventually, we defaulted to encoding the names in the correct form/order and with any punctuation required, and outputting them as-is. To retain the original behaviour, as @joeytakeda suggests, without causing undue offence, perhaps the <name> / <persName> elements here should have @xml:lang attributes, or look upwards for the nearest @xml:lang, and be processed accordingly. So a name with "en" might be processed into "surname, forename" while a name with "ja" would go to "forenamesurname", and a name with "is" would go to "forename surname".
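The language-driven idea could be sketched as follows. This is a hedged Python illustration rather than a proposal for the actual XSLT: the helper names `nearest_lang` and `render_name` are hypothetical, the sample data is invented for the example, the mapping of languages to patterns is taken from the suggestion above, and the sketch uses the BCP 47 tag "ja" for Japanese. Names with no recognized language fall back to being output as encoded.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def nearest_lang(element, parents):
    """Return the closest xml:lang on the element or its ancestors, else None."""
    node = element
    while node is not None:
        lang = node.get(XML_LANG)
        if lang:
            return lang
        node = parents.get(node)  # the root has no parent, which ends the loop
    return None


def render_name(name, parents):
    """Pick an output pattern from the nearest xml:lang (illustrative mapping)."""
    forename = name.findtext("forename", default="")
    surname = name.findtext("surname", default="")
    lang = nearest_lang(name, parents)
    if lang == "en":
        return f"{surname}, {forename}"
    if lang == "ja":
        return f"{forename}{surname}"
    if lang == "is":
        return f"{forename} {surname}"
    # Unknown language: emit the text as encoded, with whitespace normalized
    return " ".join("".join(name.itertext()).split())


doc = ET.fromstring(
    '<listBibl xml:lang="en">'
    "<bibl><name><forename>Charles</forename><surname>Dickens</surname></name></bibl>"
    '<bibl><name xml:lang="is"><forename>Halldór</forename>'
    "<surname>Laxness</surname></name></bibl>"
    "</listBibl>"
)
# ElementTree has no parent pointers, so build a child-to-parent map once
parents = {child: parent for parent in doc.iter() for child in parent}

rendered = [render_name(n, parents) for n in doc.iter("name")]
print(rendered)  # ['Dickens, Charles', 'Halldór Laxness']
```

The first name inherits "en" from its ancestor, while the second overrides it locally, which is the "look upwards for the nearest @xml:lang" behaviour described above.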

martindholmes avatar Jun 29 '25 15:06 martindholmes