Sort while dropping articles

Open bwiernik opened this issue 3 months ago • 3 comments

We occasionally get requests to sort titles or other text while dropping articles like "the" or "a". https://forums.zotero.org/discussion/123127/virtual-csl-editor-how-to-sort-by-title-skipping-the-article

@adam3smith @adunning

A few considerations:

  1. How common is this sort of requirement? Is it worth investing in implementing?

  2. Do non-English styles have similar rules or could we implement this only for English (like we do with casing)?

  3. When a style in English calls for this, does it apply language-specific article-dropping rules to non-English items that are cited, or does it use the same English rules (or ignore)? What about styles in other languages/multilingual styles?

bwiernik avatar Sep 03 '25 22:09 bwiernik

  1. How common is this sort of requirement? Is it worth investing in implementing?

As far as I know, the standard BibTeX style plain.bst strips a leading article from the title before sorting, and this behavior is common in English BibTeX styles.
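A loose Python analogue of that kind of stripping might look like the sketch below. It is not a port of plain.bst. The names `ARTICLES`, `chop_word`, and `sort_title` are made up for illustration, and the case-sensitive prefix match followed by lowercasing is an assumption about how such styles typically behave.

```python
# Hypothetical sketch: strip a leading English article before sorting titles.
# ARTICLES, chop_word and sort_title are illustrative names, not BibTeX functions.
ARTICLES = ("A ", "An ", "The ")


def chop_word(title: str, word: str) -> str:
    """Remove `word` from the start of `title` if it is there (case-sensitive)."""
    return title[len(word):] if title.startswith(word) else title


def sort_title(title: str) -> str:
    """Build a sort key: drop a leading article, then lowercase."""
    for article in ARTICLES:
        title = chop_word(title, article)
    return title.lower()


titles = ["The Tempest", "A Midsummer Night's Dream", "Hamlet", "An Ideal Husband"]
sorted_titles = sorted(titles, key=sort_title)
# Hamlet, An Ideal Husband, A Midsummer Night's Dream, The Tempest
```

The displayed titles are untouched; only the key used for comparison loses the article.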

citeproc-js has a relevant function, strip_prepositions() (perhaps incorrectly named), but there is a bug in its implementation. It assumes that the article is followed by a space in the sort key, but by the time the function is called the space has already been replaced (node_key.js#L197) with sort_sep (| in most cases, build.js#L185-L189).
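To make the failure mode concrete, here is a small Python sketch of the ordering problem. It does not reproduce the citeproc-js code: `SORT_SEP`, `make_sort_key`, and both strip functions are hypothetical stand-ins, and the only thing taken from the description above is that spaces become the separator before the stripping step runs.

```python
import re

# Assumption: "|" stands in for sort_sep, as described above.
SORT_SEP = "|"

# A pattern that expects the article to be followed by a space ...
ARTICLE_THEN_SPACE = re.compile(r"^(?:the|an|a) ", re.IGNORECASE)
# ... and one that expects it to be followed by the separator instead.
ARTICLE_THEN_SEP = re.compile(r"^(?:the|an|a)" + re.escape(SORT_SEP), re.IGNORECASE)


def make_sort_key(title: str) -> str:
    """Hypothetical key builder: spaces are replaced before any stripping."""
    return title.replace(" ", SORT_SEP)


def strip_expecting_space(key: str) -> str:
    """Never matches, because the key no longer contains spaces."""
    return ARTICLE_THEN_SPACE.sub("", key)


def strip_expecting_sep(key: str) -> str:
    """Matches the article followed by the separator."""
    return ARTICLE_THEN_SEP.sub("", key)


key = make_sort_key("The Tempest")            # "The|Tempest"
unchanged = strip_expecting_space(key)        # still "The|Tempest"
stripped = strip_expecting_sep(key)           # "Tempest"
```

Either matching on the separator or stripping before the spaces are replaced would avoid the mismatch in this toy version; which fix suits citeproc-js is a separate question.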

zepinglee avatar Sep 04 '25 06:09 zepinglee

I cannot generally tell how common that is, but at least Chicago requires that kind of sorting:

In the notes and bibliography system (as opposed to author-date; see 13.112), titles by the same author are normally listed alphabetically. (When the entry includes a chapter or article title before its associated book or periodical title, the first title determines alphabetical order.) An initial the, a, or an is ignored in the alphabetizing (as in the first Squire entry below). Note that all works by the same person (or by the same persons in the same order)—whether that person is editor, author, translator, or compiler—appear together, regardless of the added abbreviation. For the use of the 3-em dash for repeated names, see 13.72–73.

denismaier avatar Sep 05 '25 08:09 denismaier

It would make a great difference to have this functionality. It is a standard alphabetization rule in indexing to ignore all definite and indefinite articles. Ideally this would extend beyond the standard a/an/the in English to cover articles in all languages in a multilingual context.

For bibliographies, this is explicit in Chicago, MHRA, MLA, and New Hart's Rules: I can dig out references if needed. I would be surprised to find any style specifying that titles should be sorted in bibliographies with an initial article; any that does is probably reflecting technical limitations rather than ideals. Quite often, it never comes up. For example, ISO 690:2021, sect. 7.2.4.1, has a list of elements to ignore across languages when sorting names, which implies that one should do something similar for titles, but I don't think that they address this.

I would propose adding a sort-with-article attribute on <key> elements. Locales would have the option to set this as true by default and to provide a list of articles to ignore.
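As a rough illustration of how a processor might act on such an attribute, here is a Python sketch. Everything in it is an assumption: the `ARTICLES_BY_LANG` lists are short invented samples rather than real locale data, `sort_key` is a hypothetical helper, and I am reading `sort-with-article="true"` as "keep the article when sorting".

```python
# Hypothetical per-locale article lists; real ones would come from locale files.
ARTICLES_BY_LANG = {
    "en": ("the", "an", "a"),
    "de": ("der", "die", "das", "ein", "eine"),
    "fr": ("le", "la", "les", "un", "une", "l'"),
}


def sort_key(title: str, lang: str = "en", sort_with_article: bool = False) -> str:
    """Return a sort key, dropping a leading article unless told to keep it."""
    text = title.casefold()
    if sort_with_article:
        return text
    for article in ARTICLES_BY_LANG.get(lang, ()):
        # Elided forms such as "l'" attach directly; others need a following space.
        prefix = article if article.endswith("'") else article + " "
        if text.startswith(prefix):
            return text[len(prefix):]
    return text


items = [
    ("The Tempest", "en"),
    ("Die Zauberflöte", "de"),
    ("L'Avare", "fr"),
    ("Hamlet", "en"),
]

without_articles = [t for t, lang in sorted(items, key=lambda i: sort_key(*i))]
with_articles = [t for t, lang in sorted(items, key=lambda i: sort_key(i[0], i[1], True))]
```

Here `without_articles` comes out as L'Avare, Hamlet, The Tempest, Die Zauberflöte, while `with_articles` gives Die Zauberflöte, Hamlet, L'Avare, The Tempest. The sketch uses each item's own language and plain code-point comparison; proper collation and typographic apostrophes are left out.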

adunning avatar Sep 05 '25 10:09 adunning