
unclear documentation of the language key.

Open u-fischer opened this issue 10 months ago • 11 comments

The documentation of the language key is imho too vague. It currently says

Languages may be specified literally or as localisation keys. If localisation keys are used, the prefix lang is omissible.

But there is no clear definition of what "localisation keys" are, or of what consequences a wrong language value has.

As an example, I just came across a document that set the language using BCP values, and whose author then wondered about an "(en)" appearing in the bibliography ("in en" in biblatex-chicago):

\documentclass{article}
\begin{filecontents}[overwrite]{language-test.bib}
@article{testA,
author={Max Muster},
title = {A title},
year={2023},
language={en}
}
@article{testB,
author={Eva Muster},
title = {A title},
year={2023},
language={english}
}
\end{filecontents}


\usepackage[style=authoryear]{biblatex}
\addbibresource{language-test.bib}
\begin{document}
\cite{testA,testB}
\printbibliography
\end{document}


u-fischer avatar Feb 14 '25 12:02 u-fischer

I think the information that is missing here is that the clearlang option (which is true here, with the effect that English, being the main language, is omitted from the language list and hence from the output) only works with language keys as specified in ~~table 2~~ sec. 4.9.2.18, Language Names, of the manual (with the lang prefix optionally omitted), not with literal strings (English, en or whatever).

In general, I think the field description should hint at the clearlang option. I was baffled at first why english does not result in any output here (as opposed to, say, language={english and german}).

-- Edit: correct reference

jspitz avatar Feb 16 '25 09:02 jspitz

BTW the clearlang documentation needs some polishing as well. Currently it states that

If this option is enabled, biblatex will automatically clear the language field of all entries whose language matches the babel/polyglossia language of the document

This only happens if that language is the only language in the language field. IMHO this is not apparent from the description. Also, it is not clear that it requires language keys. I think the description should read something like:

If this option is enabled, biblatex will omit the output of the language field for all entries whose only language, as specified in this field via a language key (see sec. 4.9.2.18, Language Names, for supported keys), matches either the main (babel/polyglossia) language of the document or the language specified explicitly with the language option. The purpose of this option is the omission of redundant language specifications.
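
To make the distinction concrete, here is a minimal sketch (the entry keys and the file name are made up for illustration). English is the main language because no other language is loaded. Going by the discussion in this thread, one would expect the language to be suppressed for the first two entries only: the third lists more than one language, and the fourth uses a literal string rather than a language key.

```latex
\documentclass{article}
\begin{filecontents}[overwrite]{clearlang-sketch.bib}
@book{keyPlain,    author={A. Author}, title={One},   year={2023}, language={english}}
@book{keyPrefixed, author={B. Author}, title={Two},   year={2023}, language={langenglish}}
@book{keyList,     author={C. Author}, title={Three}, year={2023}, language={english and german}}
@book{keyLiteral,  author={D. Author}, title={Four},  year={2023}, language={en}}
\end{filecontents}

% clearlang is spelled out here only to make the setting visible
\usepackage[style=authoryear, clearlang=true]{biblatex}
\addbibresource{clearlang-sketch.bib}

\begin{document}
\nocite{*}
\printbibliography
\end{document}
```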

jspitz avatar Feb 16 '25 10:02 jspitz

I could do a pull request as soon as #1403 has been decided on.

jspitz avatar Feb 16 '25 10:02 jspitz

Thanks for bringing this up. I pushed https://github.com/plk/biblatex/commit/377bf59ad8729561fa5e5cee9816ec1844ad010a, let me know what you think.

Sorry, I don't think I want to make a definite decision on #1403, but at the moment I'm leaning towards a no. As discussed in #1390 I think this would add too much stuff at our end for only a marginal improvement. Of course this could be spun off into a separate package if anyone is super keen about this.

moewew avatar Feb 18 '25 20:02 moewew

Sorry, I don't think I want to make a definite decision on #1403, but at the moment I'm leaning towards a no. As discussed in #1390 I think this would add too much stuff at our end for only a marginal improvement.

I don't see the effort with what I proposed. You do not have to provide any bibstrings, just the ability to use bibstrings. It is a small change with a huge benefit and no drawback AFAICS. (And I really do not agree the localization of locations is a "marginal improvement".)

But your call, of course.

jspitz avatar Feb 19 '25 08:02 jspitz

Hm. I might well be missing the bigger picture, but as I see it at the moment #1403 "only" makes the field formats consider bibstrings. A user will still have to supply a list of known cities and translations, which means they will have to add code to their preamble to support this feature anyway, so they might as well consciously add the code to make the fields bibstring-aware themselves. That's why I think at the moment we're in the "marginal improvement" territory.
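
As a rough sketch of what such preamble code might look like (this is not the code from #1403; the bibstring key munich, the entry and the file name are invented for illustration, and the list format is modelled on what I believe is biblatex's default list format), the location format could print a bibstring when the list item is a known localisation key and the literal text otherwise:

```latex
\documentclass[ngerman]{article}
\usepackage[T1]{fontenc}
\usepackage{babel}
\usepackage{csquotes}
\usepackage[backend=biber, style=authoryear]{biblatex}

% Print a list item as a bibstring if it is a known key, literally otherwise
\DeclareListFormat{location}{%
  \usebibmacro{list:delim}{#1}%
  \ifbibstring{#1}{\bibstring{#1}}{#1}\isdot
  \usebibmacro{list:andothers}}

% Hypothetical key: every translation still has to be supplied by the user
\NewBibliographyString{munich}
\DefineBibliographyStrings{english}{munich = {Munich}}
\DefineBibliographyStrings{german}{munich = {München}}

\begin{filecontents}[overwrite]{location-sketch.bib}
@book{sketch,
  author   = {Anne Elk},
  title    = {Ein Titel},
  year     = {1980},
  location = {munich and Hamburg},
}
\end{filecontents}
\addbibresource{location-sketch.bib}

\begin{document}
\nocite{sketch}
\printbibliography
\end{document}
```

Here Hamburg is not a defined bibstring, so it falls through to the literal branch, while munich is looked up in the active language.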

As I said, I'd really rather not get into maintaining a huge list of translated city names (for one I'm afraid we might get drawn into political discussions).

moewew avatar Feb 19 '25 18:02 moewew

Yes, that's true, it only makes the field formats consider bibstrings. IMHO this is a significant improvement which is completely backwards compatible.

jspitz avatar Feb 20 '25 07:02 jspitz

As for the amount of work it will take to maintain a db of translated city names: I can easily take that on. I am developing a web tool to make it easier for people to contribute to or make their own LBX files, and I can easily add cities to an alternate version of it.

The number of publishing cities in the world is very small, and their names can easily be scraped from public GIS tools, which are already available in many languages.

If this ever gets implemented, one thing that we should add is an easy way for the user to override a specific name. It would help avoid political discussions about specific names.

Paulo Ney

pauloney avatar Feb 20 '25 08:02 pauloney

Overriding bibkeys is straightforward. But maybe we should not occupy this ticket any longer and move back to #1390

jspitz avatar Feb 20 '25 08:02 jspitz

Thanks for bringing this up. I pushed 377bf59, let me know what you think.

I would write

... localization keys (see \secref{aut:lng:key}, especially \secref{aut:lng:key:lng}). If localisation keys are used, the prefix \texttt{lang} is omissible: both language=langenglish and language=english can be used.

Looking a bit at the documentation I came across this sentence:

The base name of the [lbx]-file must be a language name known to the babel/polyglossia packages.

Is that true? Can't I set up a duck.lbx and tell biblatex to use it? And as a more serious question: could I set things up so that biblatex could make proper use of BCP-names like language=en?

u-fischer avatar Feb 20 '25 09:02 u-fischer

I would write

... localization keys (see \secref{aut:lng:key}, especially \secref{aut:lng:key:lng}). If localisation keys are used, the prefix \texttt{lang} is omissible: both language=langenglish and language=english can be used.

Thanks. See https://github.com/plk/biblatex/commit/ee28c25dd62c2339b0269e2d578aefd490314e61.

Looking a bit at the documentation I came across this sentence:

The base name of the [lbx]-file must be a language name known to the babel/polyglossia packages.

Is that true? Can't I set up a duck.lbx and tell biblatex to use it?

Strictly speaking it is not true. You can set up different names and remap them with \DeclareLanguageMapping. But I think it is true enough in the context it is mentioned. Internally biblatex essentially uses babel names as far as possible and also loads the files based on that. So at some point you have to use babel identifiers: either they are your file name and the file is loaded based on that directly, or you remap the language you want to your non-standard file name.
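
A minimal sketch of such a remapping, assuming a hypothetical duck.lbx that inherits everything from english and overrides a single string (the file name and the replacement text are made up for illustration):

```latex
\documentclass{article}
% Hypothetical localisation file whose base name is not a babel language
\begin{filecontents}[overwrite]{duck.lbx}
\ProvidesFile{duck.lbx}[2025/02/20 sketch of a custom localisation file]
\InheritBibliographyExtras{english}
\DeclareBibliographyStrings{%
  inherit   = {english},
  andothers = {{and other ducks}{and other ducks}},
}
\endinput
\end{filecontents}

\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage[backend=biber, style=authoryear]{biblatex}
% Load duck.lbx instead of english.lbx for the babel language english
\DeclareLanguageMapping{english}{duck}
\addbibresource{biblatex-examples.bib}

\begin{document}
\cite{aksin}
\printbibliography
\end{document}
```

The babel identifier (english) still appears as the first argument of \DeclareLanguageMapping, which is the sense in which babel names cannot be avoided entirely.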

And as a more serious question: could I set things up so that biblatex could make proper use of BCP-names like language=en?

Depends on what exactly you have in mind. Making biblatex accept something like language = {en} would be as easy as providing bibstrings langen, langde etc. It would already get more tricky for subtags. If you wanted to somehow "inherit" our existing bibstrings to the right BCP tag, that would be possible with a mapping scheme (which we don't have at the moment).

The following works reasonably well, but requires manual definition of some things:

\documentclass[ngerman]{article}
\usepackage[T1]{fontenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[backend=biber, style=authoryear, clearlang=true]{biblatex}

\DeclareRedundantLanguages{en}{british,english}
\DeclareRedundantLanguages{de}{german,ngerman}

\NewBibliographyString{langen}
\NewBibliographyString{langde}
\DefineBibliographyStrings{english}{
  langen = {English},
  langde = {German},
}
\DefineBibliographyStrings{german}{
  langen = {Englisch},
  langde = {Deutsch},
}

\begin{filecontents}{\jobname.bib}
@book{elk,
  author    = {Anne Elk},
  title     = {A Theory on Brontosauruses},
  year      = {1972},
  publisher = {Monthy \& Co.},
  location  = {London},
  language  = {en},
}
@book{elk:translated,
  author    = {Anne Elk},
  title     = {Brontosaurier},
  year      = {1980},
  publisher = {Fliegender Zirkus},
  location  = {Hamburg},
  language  = {de},
}
\end{filecontents}
\addbibresource{\jobname.bib}
\addbibresource{biblatex-examples.bib}

\begin{document}
Lorem \autocite{elk,elk:translated}

\printbibliography
\end{document}

If you mean real localization support (what langid and language detection are for), that is more tricky, since we basically use a method that is tailored to how things used to work with babel, and we made it work in polyglossia by complaining to the polyglossia devs (thanks @jspitz!) until they offered us equivalent interfaces. We're already running into trouble with that approach with the new babel .ini files (#1362). Of course, if babel and polyglossia or the LaTeX kernel were to offer compatible BCP-47 interfaces that allow us to do all we need to do (not necessarily with the same kind of approach we use at the moment, as that is pretty much just what was available in babel at the time PL wrote this), that would be great, and we could probably switch over many things easily and some things with careful consideration.

moewew avatar Feb 20 '25 19:02 moewew