Discrepancies in citation format when using `citar-insert-reference`

Open benthamite opened this issue 2 years ago • 12 comments

Apologies if I'm missing something obvious, but I notice that when I insert a formatted reference with citar-insert-reference, there are various discrepancies between the inserted reference and the same reference as it appears when exported with one of the org-mode export commands, such as org-md-export-to-markdown. Most notably, the titles are not capitalized correctly (e.g. the braces surrounding a word are not respected).

As an example, consider the following BibTeX entry:

@online{Hanson2023CanHumansBe,
  abstract =	 {It is one of the most fundamental questions in the
                  social and human sciences: how culturally plastic are
                  people? Many anthropologists have long championed the
                  view that humans are very plastic; with matching
                  upbringing people can be made to behave a very wide
                  range of ways, and to want a very wide range of
                  things. Others say human nature is far more
                  constrained, and collect descriptions of "human
                  universals" (See Brown's 1991},
  author =	 {Hanson, Robin},
  langid =	 {english},
  timestamp =	 {2023-06-14 15:12:51 (GMT)},
  title =	 {Can humans be the {FORTRAN} of creatures?},
  url =
                  {https://www.overcomingbias.com/p/how-plastic-are-peoplehtml},
  urldate =	 {2023-06-14},
}

Inserting this reference by invoking citar-insert-reference results in

[1]R. Hanson, “Can humans be the fortran of creatures?” https://www.overcomingbias.com/p/how-plastic-are-peoplehtml (accessed Jun. 14, 2023).

Whereas exporting a file that cites that work via org-md-export-to-markdown will show it in the "bibliography" section as

R. Hanson, “Can humans be the FORTRAN of creatures?” https://www.overcomingbias.com/p/how-plastic-are-peoplehtml (accessed Jun. 14, 2023).

I used the IEEE CSL citation style in this case, but the issue occurs with every style I tried.

benthamite • Jun 14 '23 15:06

Are you using the default formatter, or the citeproc-el one?

I realize it's not documented in the README (PR welcome), but I suspect that's it; either you aren't using the citeproc formatter, or you're using a different style for that?

bdarcus • Jun 14 '23 15:06

Thanks for the quick reply.

In my config, the values of citar-citeproc-csl-styles-dir and citar-citeproc-csl-locales-dir are set to org-cite-csl-styles-dir and org-cite-csl-locales-dir, respectively, and citar-format-reference-function is set to citar-citeproc-format-reference. Finally, citar-citeproc-select-csl-style is set to ieee.csl, which is a file that exists in citar-citeproc-csl-styles-dir. Is there anything else that needs to be done for citar-citeproc.el to work properly?
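A minimal sketch of the configuration described above, as it might look in an init file. The variable names come from this comment, with one exception: the sketch assumes the style is held in a variable named `citar-citeproc-csl-style` and that `citar-citeproc-select-csl-style` is the interactive command that sets it. That is an assumption, not something confirmed in this thread.

```elisp
;; Sketch only, not a verified configuration.
(require 'oc-csl)          ; provides org-cite-csl-styles-dir and org-cite-csl-locales-dir
(require 'citar-citeproc)

(setq citar-citeproc-csl-styles-dir org-cite-csl-styles-dir
      citar-citeproc-csl-locales-dir org-cite-csl-locales-dir
      ;; Format inserted references with citeproc rather than the default formatter.
      citar-format-reference-function #'citar-citeproc-format-reference
      ;; ASSUMPTION: variable name; the comment above names
      ;; citar-citeproc-select-csl-style instead.
      citar-citeproc-csl-style "ieee.csl")
```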

In case it helps to understand what might be going on, I instrumented citar-citeproc-format-reference and copied the output of each step in the evaluation to the attached file.

debugger-output.txt

benthamite • Jun 14 '23 16:06

OK.

To go back to this:

> Most notably, the titles are not capitalized correctly (e.g. the braces surrounding a word are not respected).

Here we're using the citar cache, which we populated using parsebib-parse with the display option, rather than parsing the bib on its own. That strips intra-field markup.

Obviously that enhances responsiveness, at the expense of some correctness.

Not sure if there's an easy way to resolve that, or if we could make it configurable.
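One way to check this hunch is to parse the same file twice and compare the title field. The sketch below assumes `parsebib-parse` accepts a `:display` keyword and returns a hash table keyed by citation key; the helper name `my/compare-title-parsing` and the file path are hypothetical.

```elisp
(require 'parsebib)

;; Hypothetical helper, for illustration only.
(defun my/compare-title-parsing (bib-file key)
  "Return the title of KEY in BIB-FILE, parsed with and without display cleanup."
  (let ((for-display (parsebib-parse (list bib-file) :display t))
        (verbatim (parsebib-parse (list bib-file) :display nil)))
    (list :display (cdr (assoc-string "title" (gethash key for-display) t))
          :verbatim (cdr (assoc-string "title" (gethash key verbatim) t)))))

;; Placeholder path; point it at a .bib file containing the entry.
(my/compare-title-parsing "bibliography.bib" "Hanson2023CanHumansBe")
```

If the `:display` value has lost the braces around FORTRAN while the `:verbatim` value keeps them, that would support the explanation above.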

bdarcus • Jun 14 '23 18:06

@benthamite can you confirm my hunch in my last reply?

bdarcus • Aug 24 '23 13:08

Apologies, I hadn't seen your previous message. I should be able to look into this within the next couple of days.

benthamite • Aug 25 '23 22:08

Hi @bdarcus,

For testing purposes, I created bibliography.bib:

@online{Hanson2023CanHumansBe,
  abstract =	 {It is one of the most fundamental questions in the
                  social and human sciences: how culturally plastic are
                  people? Many anthropologists have long championed the
                  view that humans are very plastic; with matching
                  upbringing people can be made to behave a very wide
                  range of ways, and to want a very wide range of
                  things. Others say human nature is far more
                  constrained, and collect descriptions of "human
                  universals" (See Brown's 1991},
  author =	 {Hanson, Robin},
  langid =	 {english},
  timestamp =	 {2023-06-14 15:12:51 (GMT)},
  title =	 {Can humans be the {FORTRAN} of creatures?},
  url =
                  {https://www.overcomingbias.com/p/how-plastic-are-peoplehtml},
  urldate =	 {2023-06-14},
}

and config.el:

(setq org-cite-global-bibliography '("bibliography.bib"))
(setq org-cite-export-processors
      '((t . (csl "ieee.csl"))))
(setq citar-bibliography '("bibliography.bib"))

After evaluating the latter, I evaluate (citar-citeproc--itemgetter '("Hanson2023CanHumansBe")), which returns

(("Hanson2023CanHumansBe" (URL . "https://www.overcomingbias.com/p/how-plastic-are-peoplehtml") (title . "Can humans be the fortran of creatures?") (blt-type . "online") (type . "webpage") (language . "en-US") (abstract . "It is one of the most fundamental questions in the social and human sciences: how culturally plastic are people? Many anthropologists have long championed the view that humans are very plastic; with matching upbringing people can be made to behave a very wide range of ways, and to want a very wide range of things. Others say human nature is far more constrained, and collect descriptions of \"human universals\" (See Brown’s 1991") (author ((family . "Hanson") (given . "Robin"))) (accessed (date-parts (2023 6 14)))))

By contrast, if I create document.org

[cite:@Hanson2023CanHumansBe]

#+print_bibliography:

and run org-md-export-to-markdown, I get

<a href="#citeproc_bib_item_1">[1]</a>  

<style>.csl-left-margin{float: left; padding-right: 0em;}
 .csl-right-inline{margin: 0 0 0 1em;}</style><div class="csl-bib-body">
  <div class="csl-entry"><a id="citeproc_bib_item_1"></a>
    <div class="csl-left-margin">[1]</div><div class="csl-right-inline">R. Hanson, “Can humans be the FORTRAN of creatures?” <a href="https://www.overcomingbias.com/p/how-plastic-are-peoplehtml">https://www.overcomingbias.com/p/how-plastic-are-peoplehtml</a> (accessed Jun. 14, 2023).</div>
  </div>
</div>

As you can see, the word "FORTRAN" is in all caps in the exported Markdown, but not in the output of (citar-citeproc--itemgetter '("Hanson2023CanHumansBe")).

I'm not entirely sure this is the kind of test you wanted me to run. Please let me know if there's anything else I should do. I'm attaching the relevant files in case it helps you reproduce the issue. files.zip

benthamite • Aug 29 '23 14:08

Thanks.

I'm almost certain my assumption is correct: using our cache for the formatting means the TeX markup gets stripped before citeproc sees it.

Still not sure what we can, or should, do about that.

bdarcus • Aug 29 '23 14:08

I wanted to bump this, as I just stumbled upon the same issue.

leinfink • Jul 09 '24 13:07

Maybe a quick fix would be a manual (prefix) toggle for whether to use the cache?

leinfink • Jul 09 '24 13:07

Oh, never mind, I think that was just an issue with a CSL style. Am I right in assuming that this actually got fixed?

leinfink • Jul 09 '24 13:07

Nothing changed on this end.

And I should clarify: it's not per se the cache, but rather that we use parsebib-parse with the display option to populate it.

We would do the same without a cache; the alternative would be for us to parse the input ourselves.

bdarcus • Jul 09 '24 13:07