
Different sorting schemes by Entry Types

Open matteofg opened this issue 1 year ago • 32 comments

Biblatex offers several sorting tools, but no simple way to set different sorting templates for different entry types. Is there a technical reason why the sorting option cannot be set on a per-type basis? E.g. \ExecuteBibliographyOptions[article]{sorting=custom}.

Matteo

matteofg avatar Jul 04 '23 08:07 matteofg

The problem is that this results in incoherent sorted lists in general, as the criterion isn't visible to the user and it's not obvious what it is. You can have different lists of types sorted differently, just not everything in the same list sorted in different ways depending on type.
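A minimal sketch of that alternative, with one list per entry type and each list sorted by its own template. It assumes a hypothetical references.bib containing @article and @customa entries; nty and ydnt are standard templates, and the list titles are illustrative.

```latex
\documentclass{article}
\usepackage[style=authortitle]{biblatex}
\addbibresource{references.bib} % hypothetical file name

\begin{document}
\nocite{*}

% Articles only, sorted by name, title, year.
\begin{refcontext}[sorting=nty]
  \printbibliography[type=article, title={Articles}]
\end{refcontext}

% customa entries only, sorted by year (descending), name, title.
\begin{refcontext}[sorting=ydnt]
  \printbibliography[type=customa, title={Other sources}]
\end{refcontext}
\end{document}
```

Each list stays internally coherent because a single template governs it; the per-type split comes from the type filter, not from the sorting.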

plk avatar Jul 04 '23 16:07 plk

I imagined a similar response, but it leaves me a bit puzzled. When regular types are used the inconsistency is obvious, but as pointed out in some discussions on tex.stackexchange.com, the usefulness of this solution becomes clear in special cases where non-standard types are used.

The following is a use case:

  • definition of a non-standard driver (like @custom[a-f]) in which both date and eventdate fields are used;
  • use of a single bibliography (no multiple bibliographies);
  • need to use, for the non-standard driver only, the eventdate field instead of the date field in the sorting template.

I don't know if it is possible to get there with the "Dynamic Modification of Data" tools, but it would certainly be easy via a setting like \ExecuteBibliographyOptions[customa]{sorting=<custom>}.

Matteo

matteofg avatar Jul 05 '23 07:07 matteofg

So, you mean you want to use eventdate instead of date only for some types? Couldn't you just always do that or are there cases where some types would have an eventdate that you didn't want to use for sorting? And would that non-sorting eventdate be printed? If not, it could just be removed dynamically on a per-type basis. I'm trying to think of methods that don't mean a significant new feature for edge cases.

plk avatar Jul 05 '23 08:07 plk

No field should be dynamically deleted, nor should the sortkey field be dynamically populated. The issue is only about using a different sorting scheme for a specific entry type.

In the specific case, the requirement is to use, for @custom[a-f] only, the eventdate field instead of the date field in the sorting template.

Thus, if for example sorting=nty, the sorting template for @custom[a-f] should be as follows:

\DeclareSortingTemplate{nty}{
  \sort{
    \field{presort}
  }
  \sort[final]{
    \field{sortkey}
  }
  \sort{
    \field{sortname}
    \field{author}
    \field{editor}
    \field{translator}
    \field{sorttitle}
    \field{title}
  }
  \sort{
    \field{sorttitle}
    \field{title}
  }
  \sort{
    \field{sortyear}
    \field{eventdate} % instead of \field{year}
  }
  \sort{
    \field{volume}
    \literal{0}
  }
}

Probably, in this specific case, the best solution would be a command that works like \DeclareSortExclusion and \DeclareSortInclusion, but performs a substitution instead of an exclusion/inclusion.

A command like \DeclareSortReplace{⟨entrytype, ...⟩}{(⟨origfield⟩)(⟨field⟩), ...} could be an interesting new feature.
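For comparison, the two existing commands are used roughly as in the sketch below. The entry types and fields are illustrative choices, not something taken from the use case above; the point is only that the existing commands switch a field on or off per type and cannot swap one field for another.

```latex
% Existing per-type sorting controls (illustrative types and fields).
\DeclareSortExclusion{customa}{year}   % ignore year when sorting @customa entries
\DeclareSortExclusion{*}{title}        % ignore title for every entry type ...
\DeclareSortInclusion{article}{title}  % ... except @article
```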

Matteo

matteofg avatar Jul 05 '23 09:07 matteofg

You can already do that with the "sourcemapping" feature - replace/substitute fields during .bib parsing on a per-type basis. You could, for example, just copy eventdate into another field like date, only for certain entry types before sorting sees any data and this is all dynamic, nothing changes in the data file.

plk avatar Jul 05 '23 12:07 plk

I admit that I'm not intimately familiar with how "sourcemapping" works, but my use case (as written above) involves a non-standard driver that uses both the eventdate and date fields (so it's not possible to copy/replace/substitute these fields).

In my opinion, the solutions:

  • \ExecuteBibliographyOptions[⟨entrytype⟩]{sorting=⟨template⟩} in case of using a custom template;
  • \DeclareSortReplace{⟨entrytype, ...⟩}{(⟨origfield⟩)(⟨field⟩), ...} in case of replacing some fields, regardless of the sorting template used;

are features that are currently lacking that I wouldn't know how to provide with current biblatex tools.

matteofg avatar Jul 06 '23 15:07 matteofg

If you can put an MWE here which demonstrates what you would like to do, I can suggest a solution with the current functionality.

plk avatar Jul 06 '23 16:07 plk

You can already do that with the "sourcemapping" feature - replace/substitute fields during .bib parsing on a per-type basis.

You are right, sorry. I lost sight of the bigger picture here. Effectively, it is enough to take advantage of the sortname, sorttitle and sortyear fields, which exist precisely for this purpose.

In my use case:

\DeclareSourcemap{
  \maps{
    \map{
      \pertype{⟨entrytype⟩}
      \step[fieldsource=eventdate, final]
      \step[fieldset=sortyear, origfieldval]
    }
  }
}

However, replacing the sortyear/year field with eventdate does not have the sorting I expected, probably because of the different data model of the fields (year=integer; eventdate=date). But this, of course, is about a different issue, which it would be fair to explore elsewhere.

Matteo

PS: in conclusion, these biblatex tools, if used inappropriately, still allow you to create incoherent sorted lists, so... you could also accept the sorting option as a per-type option ;-)

matteofg avatar Jul 07 '23 10:07 matteofg

You can just copy eventdate to date as that will sort first and is of the same type. It's recommended to avoid the legacy year field whenever possible.
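A minimal sketch of that copy, assuming the customa entry type used elsewhere in this thread. Note the trade-off: with overwrite set, the original date value is replaced for these entries, so it is no longer available to the driver for printing.

```latex
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map[overwrite]{                        % allow replacing an existing date
      \pertype{customa}                     % apply to @customa entries only
      \step[fieldsource=eventdate, final]   % stop here if there is no eventdate
      \step[fieldset=date, origfieldval]    % date := value of eventdate
    }
  }
}
```

The .bib file itself is not modified; the mapping is applied by biber while it reads the data.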

You are right that sourcemapping can result in incoherent sorting but it can result in incoherent everything since it's a general dynamic data mapping facility. What I'd rather not do is introduce sorting options that can result in incoherent sorting ...

plk avatar Jul 07 '23 10:07 plk

You can just copy eventdate to date as that will sort first and is of the same type. It's recommended to avoid the legacy year field whenever possible.

Sorry to insist, but evidently I'm struggling to explain myself, or I'm missing something trivial... The following is an MWE:

% !TEX encoding = UTF-8 Unicode
% !TEX program = lualatex
% !BIB program = biber

\begin{filecontents}[overwrite]{\jobname.bib}
@customa{a,
  author = {Author},
  eventdate = {1000-01-01},
  title = {Title},
  journaltitle = {Journaltitle},
  date = {3000},
}
@customa{b,
  author = {Author},
  eventdate = {2000-01-01},
  title = {Title},
  journaltitle = {Journaltitle},
  date = {1000},
}
@customa{c,
  author = {Author},
  eventdate = {3000-01-01},
  title = {Title},
  journaltitle = {Journaltitle},
  date = {2000},
}
@article{article,
  author = {Author},
  title = {Title},
  journaltitle = {Journaltitle},
  date = {2000},
}
\end{filecontents}

\documentclass{article}
\usepackage[style=authortitle]{biblatex}
  \addbibresource{\jobname.bib}

\DeclareBibliographyDriver{customa}{%
  \usebibmacro{bibindex}%
  \usebibmacro{begentry}%
  \printnames{author}%
  \newunit
  \printeventdate
  \newunit
  \usebibmacro{title}%
  \newunit\newblock
  \bibstring{in}%
  \printunit{\intitlepunct}
  \usebibmacro{journal+issuetitle}%
  \usebibmacro{finentry}}

\begin{document}

\nocite{*}
\printbibliography
\end{document}

Without creating custom schemes (i.e. using one of the existing sorting templates, like nty, anyvt, etc.), how can I use the eventdate field instead of the date field for customa type entries only?

matteofg avatar Jul 10 '23 13:07 matteofg

Add this to the preamble:

\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map{
      \pertype{customa}
      \step[fieldsource=eventdate, match=\regexp{\A(\d+)}]
      \step[fieldset=sortyear, fieldvalue={$1}]
    }
  }
}

This will copy the year from the eventdate into the sortyear field which is always looked at before year in all default sorting templates and it will do this only for customa entrytypes.

plk avatar Jul 11 '23 06:07 plk

Thanks for the help, but the proposed solution, as the regular expression is set up, only works with eventdate fields containing different years. Consider, for example: eventdate = {1000-01-01}; eventdate = {1000-06-01}; eventdate = {1000-12-01}.

Beyond the solution to this case (which I hope is at hand), I think the original error is that the sorting templates use "deprecated" legacy fields (such as year and sortyear), which are "integer" type fields rather than "date" type fields.

matteofg avatar Jul 11 '23 07:07 matteofg

Actually, year and sortyear are not legacy fields in sorting templates, they are deprecated only in the actual .bib data files. biber splits up ISO8601 date fields into the date components which are visible to sorting. Your issue above is a different problem as you want there to sort by month and day after year which isn't part of any default sorting scheme but can easily be done. You would just populate month and day from eventdate.

plk avatar Jul 11 '23 07:07 plk

You would just populate month and day from eventdate.

Does this mean that the only solution is to create a custom sorting template? If so, we return to the original question of this issue, which is: how to use different sorting schemes by different entry type?

matteofg avatar Jul 11 '23 08:07 matteofg

Yes, sorting at more granularity than year needs a custom sorting template but doesn't need to be per-type in this case - just add month and day below year and use the sourcemap.

plk avatar Jul 11 '23 08:07 plk

Okay, all quite clear, thank you.

This discussion, in my opinion, has highlighted the limitations of standard sorting schemes.

Beyond my specific case (use of the eventdate field), I think it's not so unusual to use the date field in the form YYYY-MM-DD. The presence in standard sorting templates of the year and sortyear fields does not allow the granularity that would be desirable.

Subject to the need to safeguard backward compatibility I think it is desirable to update these schemes, using the date and sortdate fields (the latter not existing).

matteofg avatar Jul 11 '23 09:07 matteofg

The *date fields don't exist outside of the data files as they are parsed into their components (e.g. year and sortyear). We have thought about sorting by month/day after year but it's a relatively niche requirement as the vast majority of sorting is year only, differentiated by extrayear in almost every style.

plk avatar Jul 11 '23 09:07 plk

Thank you for these clarifications. Ultimately, in the specific case, how should the custom sorting template be defined?

matteofg avatar Jul 11 '23 10:07 matteofg

Put the template below in your preamble and set the package sort option: sorting=ntymd

\DeclareSortingTemplate{ntymd}{
  \sort{
    \field{presort}
  }
  \sort[final]{
    \field{sortkey}
  }
  \sort{
    \field{sortname}
    \field{author}
    \field{editor}
    \field{translator}
    \field{sorttitle}
    \field{title}
  }
  \sort{
    \field{sorttitle}
    \field{title}
  }
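  \sort{
    \field{sortyear}
    \field{year}
  }
  % month and day get their own \sort blocks: within a single block only
  % the first defined field is used, so they would be ignored next to year
  \sort{
    \field{month}
  }
  \sort{
    \field{day}
  }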
  \sort{
    \field{volume}
    \literal{0}
  }
}
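
Activating the template could look like the following sketch. It assumes the authortitle style from the MWE earlier in the thread and that the \DeclareSortingTemplate{ntymd} definition sits in the preamble after biblatex is loaded.

```latex
% Select the custom template by name via the package option.
\usepackage[style=authortitle, sorting=ntymd]{biblatex}
\addbibresource{\jobname.bib}
% ... \DeclareSortingTemplate{ntymd}{...} and any \DeclareSourcemap go here ...
```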

plk avatar Jul 11 '23 14:07 plk

Thank you for everything.

We have thought about sorting by month/day after year but it's a relatively niche requirement as the vast majority of sorting is year only

I suggest modifying the standard templates, adding the month and day fields to all schemes. Although this is a "relatively niche requirement," I think there is no downside to it: greater granularity in sorting templates; full backward compatibility.

See you soon ;-) Matteo

matteofg avatar Jul 12 '23 06:07 matteofg

Thank you for everything.

We have thought about sorting by month/day after year but it's a relatively niche requirement as the vast majority of sorting is year only

I suggest modifying the standard templates, adding the month and day fields to all schemes. Although this is a "relatively niche requirement," I think there is no downside to it: greater granularity in sorting templates; full backward compatibility.

See you soon ;-) Matteo

Dear maintainers,

I agree with @matteo339. I'm currently using biblatex to display my publication and invited-talk lists in my CV, and I need references to be sorted by full date in decreasing order. I guess this use case may concern a significant number of users, most specifically academics who want to use LaTeX for their CV.

While I managed to find some documents on the internet that allowed me to define this sorting strategy, it took me a few hours to get things done, and I believe some users may appreciate having this feature implemented in the main distribution.

You'll find the corresponding code below. If you think it may help, I could contribute to your wonderful project and submit a pull request with a clean integration of this feature.

\DeclareSortingTemplate{ymdDnt}{
  \sort{
    \field{presort}
  }
  \sort[final]{
    \field{sortkey}
  }
  \sort[direction=descending]{
    \field{sortyear}
    \field{year}
    \literal{9999}
  }
  \sort[direction=descending]{
    \field[padside=left,padwidth=2,padchar=0]{month}
    \literal{99}
  }
  \sort[direction=descending]{
    \field[padside=left,padwidth=2,padchar=0]{day}
    \literal{99}
  }
  \sort{
    \field{sortname}
    \field{author}
    \field{sorttitle}
    \field{title}
  }
  \sort{
    \field{sorttitle}
  }
  \sort[direction=descending]{
    \field[padside=left,padwidth=4,padchar=0]{volume}
    \literal{9999}
  }
}
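
For the CV use case, the template might be applied as in this sketch. It assumes the ymdDnt definition above is in the preamble and that talks are tagged with a hypothetical keyword "talk" in the .bib file; the list titles are illustrative.

```latex
% Both lists use the descending full-date template.
\begin{refcontext}[sorting=ymdDnt]
  \printbibliography[notkeyword=talk, title={Publications}]
  \printbibliography[keyword=talk, title={Invited talks}]
\end{refcontext}
```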

DavidDoukhan avatar Jul 20 '23 12:07 DavidDoukhan

@moewew - What's your view on this? It would be possible to add month/day to all default sorting templates with little impact but it could change the sort order for existing documents that have that information available. We would not change the name of any of the default templates, of course. Given the increasing propensity to cite online sources that have more granular dates, I do see the point in this.

plk avatar Jul 31 '23 11:07 plk

As you probably know I'm generally not too keen on changes to core definitions like this.

Strictly speaking this would be backwards incompatible and could change existing documents. In practice it probably wouldn't happen all that often (and people might actually prefer the more granular date sorting given that they made the effort to input a more granular date). But it is pretty simple to define a sorting template with more granular sorting for users and the whole point of all of this has always been that users can and should define their own templates if they are not happy with the standard output.

moewew avatar Jul 31 '23 17:07 moewew

I think there is a case to be made here that granular date sorting should be the default as it is what people expect when they have month/day in there but it would definitely change quite a lot of documents I think so we can't just change the defaults that easily. How about we just add new templates with d instead of y (for date)?

plk avatar Jul 31 '23 18:07 plk

... people might actually prefer the more granular date sorting given that they made the effort to input a more granular date

I think there is a case to be made here that granular date sorting should be the default as it is what people expect when they have month/day in there but it would definitely change quite a lot of documents I think so we can't just change the defaults that easily

I think that the possible modification of existing documents is a false problem. I don't think a document author would care about a sorting scheme a) by year b) by citation order. This is a consequence of the granularity limits of existing sorting templates. If the .bib file contains the date field, I don't think anyone is going to complain that their document might change, quite the contrary....

matteofg avatar Aug 01 '23 07:08 matteofg

Well, the general principle is that we don't want to change existing documents to the extent that we can avoid it. Adding month/day would change the sort order of some documents since the next sorting template element is usually something like name or title. It would potentially be annoying to users who had to redefine sorting templates to get the old behaviour back as I can imagine that auto-exported data contained month/day in cases where certain styles really only care about year. How about a new package option which enables more granular date sorting? That way, it's easy to enable/disable and we keep the default the way it is. @moewew - thoughts?

plk avatar Aug 01 '23 08:08 plk

The point about auto-exported data is a good one. Especially if people use the "ol' \clearfield{month} trick" (as opposed to sourcemapping - which may or may not be a bit tricky with dates, I admit), data that appears nowhere in the document output could influence sorting.

If we want to have more granular sorting in the core, new templates with d instead of y seem the most straightforward way. It would mean more code and schemes to maintain, but it's not a lot more complicated than what we have at the moment. I know that users love options, but this package option sounds a bit ad hoc if it were to just switch sort schemes "in the background". Or would the idea be that it would essentially tell Biber to treat ...year in sorting templates as ...year-...month-...day (possibly down to time)?

moewew avatar Aug 01 '23 09:08 moewew

I would want to make the sorting determinable from the visible information in the sorting templates, not invisible in an option that biber used in the background as that just makes sorting less transparent and not determinable from readable templates. I was thinking of just something like adding some post/prefix to granular templates and using them determined by a package option. However, you have opened a can of worms with time ... if we are going to do this properly, we might as well allow sorting all the way down an ISO8601 date. I can see that being useful for forensic bibliography lists ....

plk avatar Aug 01 '23 09:08 plk

My worry is that package options work well when they give you a choice of a small number of fixed ... well ... options, but not so great when there is a large number of possibilities to choose from. I guess if we had something like a boolean option called datesorting and code in the sorting template that would activate or deactivate according to that option, that might work. It would allow us to have two defaults: year-only or "full date" sort (where we would have to decide what "full date" means - see the time thingy). But if we want to have the option determine more (how granular the sorting becomes), things might get messy.

moewew avatar Aug 01 '23 09:08 moewew

I was thinking of an "all or nothing" option - it's either just year, or it's everything. I can't see any real use-case for anything in between.

plk avatar Aug 01 '23 09:08 plk