Split title documentation (for 1.1)
This adds documentation for the title-splitting feature to be added in 1.1. I'm not sure it's in the right place. (I actually am quite sure that it isn't, but I don't really know where to put it. Should, e.g. the attributes be added to the sections on style behavior? Only there? Or as well?)
One thing that still needs to be discussed: in the proposal I had a paragraph on dashes. As some locales prefer en dashes to em dashes, citeprocs should check against both if the "full" options are selected on normalize-title-delimiters and/or title-split. Should I add that somewhere, or should I just add "–" (an en dash) to the relevant options in the schema and in the documentation?
I was thinking we might want to add a new section for input data, and this could primarily go there, along with the rich text stuff, and dates?
Yes, this. We also probably need a whole new section in the spec to describe how title processing works, similar to the section on names and dates being split out.
I had better convert this to a draft...
@bdarcus If we want to simplify these rules, we could cut the delimiter sets down to simple (. , : , :: , ? , ! ) and extended (simple plus ; ). Em dash separation of subtitles versus colons is maybe rare enough that we could require users to split that manually. Chicago style is another rare case that could be left to users to split.
We could even eliminate extended if we want. ; is mostly used to delimit multiple subtitles. We could ignore that case or bake in that logic and ignore the rare styles that don't respect that convention.
That could eliminate the title-split attribute and just leave one of these options for spec instructions:
Ignore the ; split for subtitles:
Citeprocs split title variables into "main" and "sub" forms. Split-points can be explicitly provided in title variables by separating chunks of a title with two vertical bars:
Main Title:|| first subtitle:|| second subtitle
The "main" form is the text before the first delimiter, the "sub" form is an array of the text following each delimiter. If no split-points are supplied in the data, the citeproc will derive them by splitting the title on the following patterns: ".", ":", "::", "!", "?"
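To make this first option concrete, here is a minimal Python sketch of how a citeproc might apply it. It is only an illustration under assumptions that are not settled in this thread: the helper name split_title is hypothetical, each derived pattern is taken to be the punctuation mark followed by whitespace, derived delimiters are dropped from the output, and punctuation typed before an explicit || is left as entered.

```python
import re

# Assumption: each pattern is a punctuation mark followed by whitespace.
# "::" is listed first so that it is consumed as one delimiter.
SIMPLE_DELIMITERS = re.compile(r"(?:::|[.:!?])\s+")


def split_title(title):
    """Return (main, subs) for a title string (hypothetical helper)."""
    if "||" in title:
        # Explicit split-points win; text around them is kept as typed.
        chunks = title.split("||")
    else:
        # No explicit split-points: derive them from the patterns.
        chunks = SIMPLE_DELIMITERS.split(title)
    chunks = [chunk.strip() for chunk in chunks if chunk.strip()]
    if not chunks:
        return "", []
    return chunks[0], chunks[1:]


main, subs = split_title("Main Title:|| first subtitle:|| second subtitle")
print(main, subs)
```

Note that a pattern like ". " would also split after abbreviations such as "Dr.", so data with such titles would need explicit split-points.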
Split subtitles on ; :
Citeprocs split title variables into "main" and "sub" forms. Split-points can be explicitly provided in title variables by separating chunks of a title with two vertical bars:
Main Title:|| first subtitle:|| second subtitle
The "main" form is the text before the first delimiter, the "sub" form is an array of the text following each delimiter. If no split-points are supplied in the data, the citeproc will derive them. The main title is separated from any subtitles by splitting the title on the first of the following patterns: ".", ":", "::", "!", "?". Multiple subtitles are separated by splitting the remaining title string on the following patterns: ".", ":", "::", "!", "?", ";".
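The two-stage derivation in this second option could look roughly like the following Python sketch. The function name split_main_and_subs is hypothetical, and the sketch again assumes that each pattern is the punctuation mark followed by whitespace and that delimiters are dropped from the result; explicit || handling is left out to keep it short.

```python
import re

# Assumption: patterns are punctuation followed by whitespace.
MAIN_DELIMITERS = re.compile(r"(?:::|[.:!?])\s+")
# The subtitle set is the main set plus ";".
SUB_DELIMITERS = re.compile(r"(?:::|[.:!?;])\s+")


def split_main_and_subs(title):
    """Split off the main title first, then split the rest into subtitles."""
    # Stage 1: only the first match separates main title from subtitles.
    parts = MAIN_DELIMITERS.split(title, maxsplit=1)
    main = parts[0].strip()
    if len(parts) == 1:
        return main, []
    # Stage 2: the remaining string is split on the extended set.
    subs = [s.strip() for s in SUB_DELIMITERS.split(parts[1]) if s.strip()]
    return main, subs


print(split_main_and_subs("War; peace: first part; second part"))
```

In this sketch a semicolon inside the main title is left alone, because ";" only counts as a delimiter once the main title has been separated.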
We could even eliminate extended if we want. ; is mostly used to delimit multiple subtitles.
Couldn't we just add ; to the list of delimiters and be done? Like:
If no split-points are supplied in the data, the citeproc will derive them by splitting the title on the following patterns:
".", ":", ";", "::", "!", "?"
A semicolon in a title that doesn't serve as a delimiter should be rare enough anyway, right?
Probably.
Chicago style splits and punctuation are now also endorsed by MLA:
For an alternative or double title in English beginning with or, we follow the first example given in section 8.165 of The Chicago Manual of Style and punctuate as follows: England’s Monitor; or, The History of the Separation (452) But no semicolon is needed for a title in English that ends with a question mark or exclamation point: “Getting Calliope through Graduate School? Can Chomsky Help? or, The Role of Linguistics in Graduate Education in Foreign Languages”
https://style.mla.org/punctuation-with-titles/
Let's change the relevant pattern to ; or, and recognize it always.
By the way, why do we have :: in the pattern list? What's that for?
It was in Frank's list, I think because a lot of library catalog data explicitly use :: to separate main and subtitles. IIRC, he normalizes that to just one colon. We could drop it, I think (: will still match that, obviously).
Hmmm, we, at least, use : to separate main and subtitle.
Concerning ; or,: Couldn't we just include the complete Chicago pattern here? I don't think that will be too problematic, as this should only affect subtitle casing. (And you wouldn't usually replace such a delimiter.)
How does APA deal with this?
An edge case regarding Chicago style splits. Let's say you have: "A very important title; or: This book is important"
Even if you normalize colons to periods in other cases, here you will not want "A very important title; or. This book is important" but to keep the colon, so: "A very important title; or: This book is important"
Here are the three examples from the current Chicago manual, with the first being preferred nowadays:
The Tempest, or The Enchanted Island
Moby-Dick; or, The Whale
Dr. Strangelove, or: How I Learned to Stop Worrying and Love the Bomb
The first we don't bother with. That would need to be entered as The tempest, or ||the enchanted island or as The tempest, or The enchanted island.
The third is handled by the normal colon rules.
The second is the only one that needs to be handled. This regex pattern should work for that: /; or,?/ (the ? indicates that the comma is optional). With ; being in the main split list, the only thing that needs to be accommodated is to not capitalize "or".
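As a rough illustration of that last point, here is a Python sketch of subtitle capitalization that skips a leading "or". The helper name capitalize_subtitle is hypothetical, and the sketch assumes the chunk has already been split off at the ";" so that only the "or,?" part of the pattern remains at the start of the subtitle.

```python
import re

# The "or,?" part of /; or,?/, anchored to the start of a subtitle chunk.
# Requiring whitespace afterwards keeps words like "organic" from matching.
LEADING_OR = re.compile(r"or,?\s+")


def capitalize_subtitle(sub):
    """Capitalize the first word of a subtitle, skipping a leading 'or'."""
    match = LEADING_OR.match(sub)
    offset = match.end() if match else 0
    # Uppercase only the first character after the optional "or, " prefix.
    return sub[:offset] + sub[offset:offset + 1].upper() + sub[offset + 1:]


print(capitalize_subtitle("or, the whale"))
```

A real implementation would presumably apply this only to chunks that followed a ";" delimiter; the sketch does not track which delimiter preceded the chunk.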
Just noting here: Chicago and MLA describe those examples (the parts after the "or") as "alternative" or "double" titles.
In an object representation they'd have separate properties.
I think that’s more a semantic description. In most data they are going to be entered as a flat title (e.g., especially the Dr. Strangelove title).
The third is handled by the normal colon rules.
But shouldn't we make sure the colon does not get replaced here?
I think that’s more a semantic description. In most data they are going to be entered as a flat title (e.g., especially the Dr. Strangelove title).
I don't think the latter point is relevant, but I'll save that for another thread.
On the first point, it's exactly how it's described in the style guides.
From here:
For an alternative or double title in English beginning with or, we follow the first example given in section 8.165 of The Chicago Manual of Style and punctuate as follows ...
Yes, I think that's more a semantic description of the type of subtitle, rather than a description of how to expect these to appear in item data.
Here is the full description from the Chicago manual:
14.91 Use of "or" with double titles. Old-fashioned double titles (or titles and subtitles) connected by or have traditionally been separated by a semicolon (or sometimes a colon), with a comma following or, or more simply by a single comma preceding or. (Various other combinations have also been used.) When referring to such titles, prefer the punctuation on the title page or at the head of the original source. In the absence of such punctuation (e.g., when the title is distinguished from the subtitle by typography alone), or when the original source is not available to consult, use the simpler form shown in the first example. This departure from earlier editions recognizes the importance of balancing editorial expediency with fidelity to original sources. The second example preserves the usage on the original title pages of the American and British editions of Melville’s classic novel (and assumes one of those editions, or a later edition that preserves such punctuation, was in fact consulted). The third example (of a modern film) preserves the colon of the original title sequence but adds a comma to separate the main title from the secondary title (distinguished only graphically in the original). In all cases, the first word of the subtitle (following or) should be capitalized. See also 14.87, 14.88.
The Tempest, or The Enchanted Island
but
Moby-Dick; or, The Whale
Dr. Strangelove, or: How I Learned to Stop Worrying and Love the Bomb
But shouldn't we make sure the colon does not get replaced here?
We could. In that case, listing these separately would be best: /; or,? / and /, or: /
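A small Python sketch of what listing the two patterns separately could look like. The function name split_on_or and the three-part return value are assumptions for illustration only; the point is that the matched delimiter is returned verbatim, so the colon in ", or: " is never passed through any colon normalization.

```python
import re

# The two patterns from the comment above, tried at each position in order.
OR_PATTERNS = re.compile(r"; or,? |, or: ")


def split_on_or(title):
    """Return (main, delimiter, alternative); delimiter is kept verbatim."""
    match = OR_PATTERNS.search(title)
    if match is None:
        # E.g. "The Tempest, or The Enchanted Island" is deliberately not split.
        return title, None, None
    return title[:match.start()], match.group(0), title[match.end():]


print(split_on_or("Moby-Dick; or, The Whale"))
print(split_on_or("Dr. Strangelove, or: How I Learned to Stop Worrying and Love the Bomb"))
```

Running these patterns before the general delimiter split would also keep the ". " in "Dr. Strangelove" out of play for the third example, though that ordering is not something the thread has decided.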
I don't disagree that it's semantic; I just wanted to make clear it wasn't my interpretation of the rule descriptions.