
lowercase of "translated and commented ..." after full stop

Open ClintEastwood opened this issue 10 years ago • 20 comments

Consider this entry:

@book{aristotle:makin:2006,
    Author = {Aristotle},
    Commentator = {Stephen Makin},
    Introduction = {Stephen Makin},
    Location = {Oxford},
    Publisher = {Clarendon Press},
    Series = {Clarendon Aristotle Series},
    Subtitle = {Book Θ},
    Title = {Metaphysics},
    Translator = {Stephen Makin},
    Year = {2006}}

Note that the subtitle ends with a Greek letter "Θ".

Run this through \usepackage[backend=biber,style=authortitle-comp,abbreviate=false]{biblatex}

and you will get the following output:

Aristotle. Metaphysics. Book Θ. translated and commented, with an introduction, by Stephen Makin. Clarendon Aristotle Series. Oxford: Clarendon Press, 2006.

Notice the lowercase "t" after the dot after the subtitle. That does not seem right. If I write "Theta" instead of "Θ", everything is fine.

Thanks for your good work!

ClintEastwood, Feb 04 '15 18:02

The following MWE reproduces the problem and shows another very interesting effect:

\documentclass{article}
\usepackage{fontspec}
\usepackage[style=verbose]{biblatex}
\usepackage{filecontents}

\begin{filecontents*}{\jobname.bib}
@book{a,
    Author = {Aristotle},
    Title = {A},
    Translator = {Stephen Makin},
    Year = {2006}}
@book{b,
    Author = {Aristotle},
    Title = {Akela},
    Translator = {Stephen Makin},
    Year = {2006}}
@book{c,
    Author = {Aristotle},
    Title = {Θ},
    Translator = {Stephen Makin},
    Year = {2006}}
\end{filecontents*}

\addbibresource{\jobname.bib}

\begin{document}
\cite{a}

\cite{b}

\cite{c}

\printbibliography
\end{document}

This gives in the text

Aristotle. A. trans. by Stephen Makin. 2006
Aristotle. Akela. Trans. by Stephen Makin. 2006
Aristotle. Θ. trans. by Stephen Makin. 2006

and in the references

Aristotle. A. Trans. by Stephen Makin. 2006.
— Akela. Trans. by Stephen Makin. 2006.
— Θ. trans. by Stephen Makin. 2006.

Here you can see that the entry whose title is just "A" has the issue only in the citation, the one titled "Akela" has no problem at all, and the one titled "Θ" has the issue in the citation as well as in the bibliography. Very mysterious.

moewew, Aug 28 '15 14:08

Doing some more research into this, I found that adding \@ to the titles seems to relieve us of all our troubles.

Is this something that could (and should) be done automatically?

See the working example below: everything is now OK except for the citation of e, which I left unchanged on purpose to show the effect of \@.

\documentclass{article}
\usepackage{fontspec}
\usepackage[style=verbose]{biblatex}
\usepackage{filecontents}

\begin{filecontents*}{\jobname.bib}
@book{a,
    Author = {Aristotle},
    Title = {A\@},
    Translator = {Stephen Makin},
    Year = {2006}}
@book{b,
    Author = {Aristotle},
    Title = {Akela},
    Translator = {Stephen Makin},
    Year = {2006}}
@book{c,
    Author = {Aristotle},
    Title = {Θ\@},
    Translator = {Stephen Makin},
    Year = {2006}}
@collection{d,
    Editor = {Aristotle},
    Title = {D},
    Translator = {Stephen Makin},
    Year = {2006}}
@book{e,
    Editor = {Aristotle},
    Title = {E},
    Translator = {Stephen Makin},
    Year = {2006}}
\end{filecontents*}

\addbibresource{\jobname.bib}

\DeclareFieldFormat[collection]{title}{\mkbibemph{#1}\@}

\begin{document}
\cite{a}

\cite{b}

\cite{c}

\cite{d}

\cite{e}

\printbibliography
\end{document}

This now gives in the text

Aristotle. A. Trans. by Stephen Makin. 2006
Aristotle. Akela. Trans. by Stephen Makin. 2006
Aristotle. Θ. Trans. by Stephen Makin. 2006
Aristotle, ed. D. Trans. by Stephen Makin. 2006
Aristotle, ed. E. trans. by Stephen Makin. 2006

and in the references

Aristotle. A. Trans. by Stephen Makin. 2006.
— Akela. Trans. by Stephen Makin. 2006.
— ed. D. Trans. by Stephen Makin. 2006.
— ed. E. Trans. by Stephen Makin. 2006.
— Θ. Trans. by Stephen Makin. 2006.

moewew, Sep 06 '15 16:09

@josephwright, @aboruvka - this is an interesting suggestion - \@ is clearly correct here and perhaps should be default except that this does depend somewhat on the assumptions of the default styles.

plk, Sep 06 '15 17:09

I found that adding \@ to \blx@qp@period also gives the desired effect.

\makeatletter
\csdef{blx@qp@period}{\spacefactor\@m\blx@postpunct}
\makeatother

Then there would be no need to insert the \@ into any field formats.

The question "Spurious punctuation in xelatex + biblatex + Cyrillic or Greek" on TeX.SX seems to be at least related to the issue here.

moewew, Oct 24 '15 14:10

I'm quite confident this problem here is actually the same as in https://github.com/plk/biblatex/issues/368. I don't think the underlying problem is solved by sprinkling \@s in field format definitions.

moewew, Mar 12 '16 15:03

Can you please try this with the current 3.4 DEV version (and biber 2.5 DEV version if you use biber) - I believe this might be fixed.

plk, Apr 12 '16 20:04

Doesn't seem to be resolved. I've got this with @moewew's MWE:

Aristotle. A. trans. by Stephen Makin. 2006
Aristotle. Akela. Trans. by Stephen Makin. 2006
Aristotle. Θ. trans. by Stephen Makin. 2006

Aristotle. A. Trans. by Stephen Makin. 2006.
— Akela. Trans. by Stephen Makin. 2006.
— Θ. Trans. by Stephen Makin. 2006.

But if I make the main language Russian or Greek with for example

\newfontfamily\greekfont{CMU Serif}
\usepackage{polyglossia}
\setdefaultlanguage{greek}

the capitalization is mysteriously correct.

odomanov, Apr 13 '16 05:04

Capitalisation works fine in the bibliography now. In the citations it doesn't work and, as the MWE above shows, it doesn't (and didn't) even work for Latin letters. I guess that \blx@setazcodes is not called in citations, leaving them with the standard space factor codes.
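
The difference between "A" and "Akela" can be reproduced without biblatex at all. The following is a minimal sketch, assuming (as the comments in this thread suggest) that biblatex decides on capitalisation by looking at the space factor left behind by the preceding punctuation. It only prints standard TeX values to the log and does not reproduce biblatex's own \sfcode settings.

```latex
\documentclass{article}
\begin{document}
\nonfrenchspacing
% By default capital letters have \sfcode 999, lowercase letters 1000.
\typeout{sfcode of A = \the\sfcode`\A}% 999
\typeout{sfcode of a = \the\sfcode`\a}% 1000
% A period after a lowercase letter gets its full space factor ...
Akela.\typeout{after "Akela." = \the\spacefactor}% 3000
% ... but after a capital (space factor 999) TeX caps the jump at 1000,
% so the period is indistinguishable from an ordinary character.
A.\typeout{after "A." = \the\spacefactor}% 1000
% \@ resets the space factor to 1000 before the period is read.
A\@.\typeout{after "A\string\@." = \the\spacefactor}% 3000
\end{document}
```

The capped value after "A." would explain why a tracker that relies on the space factor cannot tell that a period was just typeset.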

moewew, Apr 13 '16 07:04

I can't claim to have understood the issue fully, but I found that only \blx@setfrcodes sets the sfcodes for capital letters, \blx@setencodes doesn't. From what I can see, \blx@setfrcodes is used if \frenchspacing is active, \blx@setencodes if \nonfrenchspacing; they are called in \blx@setsfcodes.

In the bibliography \frenchspacing is turned on by default, \blx@setfrcodes is called and everything is fine. But in the citations/text we use \blx@setencodes if that is not overridden by either an explicit \frenchspacing or babel/polyglossia's language settings.

One could probably just move the \blx@setazcodes bit directly into \blx@setsfcodes:

\def\blx@setsfcodes{%
  \let\blx@setsfcodes\relax
  \let\frenchspacing\blx@setfrcodes
  \let\nonfrenchspacing\blx@setencodes
  \ifnum\sfcode`\.>2000
    \blx@setencodes
  \else
    \blx@setfrcodes
  \fi
  \ifnum\sfcode`\A=\@m
  \else
    \blx@setazcodes
  \fi
  \@setquotesfcodes
  \sfcode`\(=\z@
  \sfcode`\)=\z@
  \sfcode`\[=\z@
  \sfcode`\]=\z@
  \sfcode`\<=\z@
  \sfcode`\>=\z@}

\def\blx@setencodes{%
  \sfcode`\,=1250
  \sfcode`\;=1500
  \sfcode`\:=2000
  \sfcode`\.=3000
  \sfcode`\!=3001
  \sfcode`\?=3002
}

\def\blx@setfrcodes{%
  \sfcode`\,=\blx@sf@comma
  \sfcode`\;=\blx@sf@semicolon
  \sfcode`\:=\blx@sf@colon
  \sfcode`\.=\blx@sf@period
  \sfcode`\!=\blx@sf@exclam
  \sfcode`\?=\blx@sf@question
}

alternatively

  \ifnum\sfcode`\A=\@m
  \else
    \blx@setazcodes
  \fi

can be replicated in both \blx@setencodes and \blx@setfrcodes.

moewew, Apr 13 '16 07:04

I'm pretty sure PL set things up as-is deliberately. It's safe to alter the \sfcode values for capital letters when \frenchspacing is active as there is 'nothing to do': spacing after all periods is the same. However, if you alter the \sfcode values and \nonfrenchspacing is active then you'll suddenly get larger spaces after abbreviated capital letters. One might I guess arrange that \frenchspacing applies within a citation, but I'd need to check the grouping is correct.
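
The side effect described here can be seen in isolation with a small sketch. It raises the \sfcode of a single capital letter by hand, which is an assumption made for illustration and not what biblatex's internal macros literally do, and prints the resulting space factor to the log.

```latex
\documentclass{article}
\begin{document}
\nonfrenchspacing
% Default: "S" has \sfcode 999, so "S." is not treated as a sentence end.
J. S.\typeout{default, after "S." = \the\spacefactor} Bach% 1000

% Illustration only: raise the \sfcode of "S" to 1000 inside a group.
{\sfcode`\S=1000
J. S.\typeout{raised, after "S." = \the\spacefactor} Bach\par}% 3000

% Under \frenchspacing the same change is harmless.
{\frenchspacing\sfcode`\S=1000
J. S.\typeout{french, after "S." = \the\spacefactor} Bach\par}% 1000
\end{document}
```

In the second paragraph the space after "S." is stretched like an end-of-sentence space, which is the unwanted effect under \nonfrenchspacing.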

josephwright, Apr 13 '16 10:04

@josephwright - do you think you will have time to look at this?

plk, Apr 14 '16 18:04

I would have thought that \frenchspacing would be appropriate in citations - I can't imagine why one would want anything else.

plk, Apr 14 '16 20:04

@plk 'Inside' a citation \frenchspacing is probably fine. The issue comes when the citation ends. If we start with the simple example

\documentclass{article}
\begin{document}
{\nonfrenchspacing Hello world.}\showthe\spacefactor
{\frenchspacing Hello world.}\showthe\spacefactor
\end{document}

you'll see that the space factor reported after each group is the one set inside the group: it is not restored when the group closes. Most of the time punctuation is going to be outside a citation, but if the citation ends in a . or if we have

`\autocite{key}.`

and 'auto-magic' moving of citations (which puts the . inside the citation group) then we'll be wrong in a \nonfrenchspacing document. I guess I might be able to come up with a solution to the latter example, but the general 'what if the last char is a ./!/?' problem isn't fixable. That's probably as likely to happen 'in the wild' as the problem we are asked about here.

josephwright, Apr 15 '16 06:04

Worth noting here is that this ties into the whole business of being able to 'dump' formatted bibliographies. Currently biblatex does not keep the formatted data 'around' but passes it to TeX for typesetting as it is generated. That means that we can't tell that a piece of punctuation is the last one generated by a bibliography driver (and so reset the \sfcode values) unless it comes right at the end of the driver. Arranging that everything is 'held' in a structured form would avoid that, but it would be an entire rewrite and would be pretty tricky once you start imagining all of the possible forms of the driver. (That's before the whole business of reading such data back in the document body, which looks very complex and, although possibly addressable, is at least in part why traditional BibTeX simply can't do some of this stuff.) Even if everything is 'held' you still need to track very carefully where punctuation comes from so that the \sfcode values can be manipulated. It really looks very tricky (on the border of what can be done at the TeX level).

josephwright, Apr 15 '16 08:04

Is there anything else we need to do here?

plk, Sep 25 '16 13:09

Thanks for all your work. As to plk's question: Well, I don't know. moewew's MWE still has in one instance a non-capitalised "trans."

Aristotle. A. Trans. by Stephen Makin. 2006
Aristotle. Akela. Trans. by Stephen Makin. 2006
Aristotle. Θ. Trans. by Stephen Makin. 2006
Aristotle, ed. D. Trans. by Stephen Makin. 2006
Aristotle, ed. E. trans. by Stephen Makin. 2006 <-- HERE

References

Aristotle. A. Trans. by Stephen Makin. 2006.
— Akela. Trans. by Stephen Makin. 2006.
— ed. D. Trans. by Stephen Makin. 2006.
— ed. E. Trans. by Stephen Makin. 2006.
— Θ. Trans. by Stephen Makin. 2006.

ClintEastwood, Sep 26 '16 07:09

@josephwright - any comment?

plk, Sep 26 '16 08:09

@moewew, @josephwright - do we have any potential solution to this?

plk, Jan 12 '17 23:01

As noted earlier, addressing this would be pretty much a rewrite of the entire package.

josephwright, Jan 13 '17 10:01

This came up again in https://github.com/plk/biblatex/issues/776. Could we remedy some symptoms of this by increasing the \spacefactor from 999 to 1000 at the end of printing a field?
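
As a user-level approximation of that idea, one might try something like the untested sketch below. The helper \bumpspacefactor is hypothetical (it is not part of biblatex), and the field format simply mirrors the \@ variant shown earlier in this thread, applied to @book titles only; type-specific title formats of other entry types would need the same treatment.

```latex
\makeatletter
% Hypothetical helper, not a biblatex macro: if the last character left the
% space factor at 999 (i.e. it was a capital letter), raise it to 1000.
% Unlike a bare \@, it leaves any other space factor untouched.
\newcommand*{\bumpspacefactor}{%
  \ifhmode
    \ifnum\spacefactor=999
      \spacefactor\@m
    \fi
  \fi}
\makeatother

% Place after loading biblatex; affects the title field of @book entries.
\DeclareFieldFormat[book]{title}{\mkbibemph{#1}\bumpspacefactor}
```

The \ifhmode guard is there because reading \spacefactor outside horizontal mode is an error.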

moewew, Aug 04 '18 08:08