
Potential feature: enriching the citation to include quoted text / data

Open samuelorion opened this issue 4 years ago • 3 comments

Manubot has some great features that improve how scientific communications are read and written, particularly by taking advantage of modern web infrastructure:

... articles should be interactive. Hovering over a citation should bring up the full reference and highlight other instances of the same citation ... (manufesto)

This has been extended to include the possibility of augmenting the citation with context:

Authors should have the ability to specify why they cite a particular work, preferably assigning a citation type from a standardized terminology [10]. (manufesto)

This is addressed in issue #420.

I would like to see the ability to deepen the format of our citation practices:

Not only should authors be able to "annotate citations with context", but we should also be able to show precisely what in the source we are citing as evidence for our claim (if pertinent).

That is, when one hovers over a citation and the pop-up for the full citation appears:

If the citation is, in fact, referring to a particular passage of text, or a figure, that text or image appears within the pop up box.

This could be loosely modelled on the following two things:

Zotero lets one collect annotations from PDFs and render them as Markdown.

pdf reader preview [Zotero Documentation]

Andy Matuschak has something similar, which could be an interesting way to include further material cited from your source.

https://user-images.githubusercontent.com/48258997/118158765-394d0480-b3ea-11eb-9a8b-fdc79f403c6b.mp4

samuelorion avatar May 13 '21 17:05 samuelorion

This feature idea does seem related to the Citation Typing Ontology annotations in #420 but takes the concept even further by providing the relevant part of the source material.

I expect the front-end interactive HTML would be feasible, even if it is challenging. What would the citation syntax look like? Citation context is per-citation, not per-reference (that is, it could be different each time the reference is cited). Referring to your example above, when using citation-by-identifier to cite the manufesto URL, would you want to write something like the following?

Citation context is important in academic writing [@https://manubot.github.io/manufesto/ "However, not all citations are equal: some dispute the cited work, some affirm it, most do neither."].

That hypothetical syntax adds arbitrary text to the citation but doesn't tie it to the source. I don't have ideas for how to link the citation to that particular location in the source HTML, PDF, etc.

agitter avatar May 14 '21 02:05 agitter

I don't have ideas for how to link the citation to that particular location in the source HTML, PDF, etc

This would be tough, and maybe unnecessary.

I am not sure how this would be combined with CiTO. Perhaps, where you want to do this, you would write something like: [@PMID:32341891 recommended_reading "arbitrary text" or URL to image]

Citation context is per-citation, not per-reference (that is, it could be different each time the reference is cited).

Could the arbitrary text or image be stored in a similar fashion to the CSL-JSON, so that it is almost like a separate entry when you are writing in Markdown?

Instead of writing this:

Citation context is important in academic writing [@https://manubot.github.io/manufesto/ "However, not all citations are equal: some dispute the cited work, some affirm it, most do neither."].

You would write something like this:

Citation context is important in academic writing [@https://manubot.github.io/manufesto/][source: "However, not all citations are equal: some dispute the cited work, some affirm it, most do neither."].

It would render as something like "Citation context is important in academic writing (1)", where the mouseover would display both the reference info and the source info.
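As a rough illustration of how such a syntax might be processed, here is a minimal Python sketch. Everything in it is hypothetical: the `[source: "..."]` bracket is only the syntax proposed in this thread, not something Manubot or pandoc supports, and the names `CITATION_WITH_SOURCE`, `CitationContext`, and `extract_citation_contexts` are invented for the example. It keeps one record per citation occurrence (since context is per-citation) while reusing the same number when a reference is cited again.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical syntax from this thread: [@identifier][source: "quoted passage"]
CITATION_WITH_SOURCE = re.compile(
    r'\[@(?P<identifier>[^\]\s]+)\]'            # citation key, e.g. a URL or PMID:...
    r'(?:\[source:\s*"(?P<quote>[^"]*)"\])?'    # optional per-citation quote
)


@dataclass
class CitationContext:
    reference_number: int   # shared by every citation of the same identifier
    identifier: str
    quote: Optional[str]    # None when no [source: "..."] bracket follows


def extract_citation_contexts(markdown: str) -> Tuple[str, List[CitationContext]]:
    """Replace each citation with "(n)" and collect its per-citation quote."""
    reference_numbers: Dict[str, int] = {}
    contexts: List[CitationContext] = []

    def replace(match: "re.Match[str]") -> str:
        identifier = match["identifier"]
        # Number references in order of first appearance.
        number = reference_numbers.setdefault(identifier, len(reference_numbers) + 1)
        contexts.append(CitationContext(number, identifier, match["quote"]))
        return f"({number})"

    return CITATION_WITH_SOURCE.sub(replace, markdown), contexts
```

The list of `CitationContext` records could then be serialized next to the CSL-JSON references, so a tooltip script could look up the quote for each individual citation rather than for the reference as a whole.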


I know this kind of feature may not seem that relevant, but the following example illustrates what I think it might help avoid.

Take the paper "Data-driven classification of the certainty of scholarly assertions" [PeerJ]. Though the authors mainly address how scientists write about claims of truth ([@PMID:32341891 "The grammatical structures scholars use to express their assertions are intended to convey various degrees of certainty or speculation."]), they have a really nice example of poor citation practice going wild:

This figure shows how poor citation practices lead to a loose observational statement evolving into a fully-fledged "fact":

[@https://peerj.com/articles/8871/#fig-1 "These statements represent a series of scholarly assertions about the same biological phenomenon, revealing that the core assertion transforms from a hedging statement into statements resembling fact through several steps, but without additional evidence."]

If, when an author makes a claim, a reader could quickly see what in the source is being cited, the reader could make a better decision about whether to continue accepting the construction of the argument.

samuelorion avatar May 14 '21 19:05 samuelorion

Thanks @samuelorion for the suggestion and your careful reading of the Manufesto!

On a technical level, I think what we need is a way to provide additional metadata about a citation, whether it's a CiTO term, a locator like page number, or a note on why the work is being cited.

This might be achievable with our current infrastructure. From the pandoc usage guide:

Each citation must have a key, composed of ‘@’ + the citation identifier from the database, and may optionally have a prefix, a locator, and a suffix. ... pandoc will use heuristics to distinguish the locator from the suffix. In complex cases, the locator can be enclosed in curly braces:

[@smith{ii, A, D-Z}, with a suffix]
[@smith, {pp. iv, vi-xi, (xv)-(xvii)} with suffix here]

I'd like to get a more thorough description of prefix, locator, and suffix.

If the citation is, in fact, referring to a particular passage of text, or a figure, that text or image appears within the pop up box.

This sounds like a locator to me. Other notes, like why the work is being cited, probably belong in the suffix.

So the next question is how to get the prefix, locator, and suffix rendered in the HTML. We could then make them visible in the citation tooltips. I am not sure whether the CSL XML style has control over prefix, locator, and suffix, or whether this is a pandoc-specific feature.
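To make the question concrete, here is a small Python sketch that pulls the prefix and suffix of each citation out of pandoc's JSON AST (what `pandoc -t json` emits). As far as I understand, the AST's citation objects carry `citationPrefix` and `citationSuffix` but no separate locator field, so the locator would still be embedded in the suffix at this stage; treat that as an assumption to verify. The `SAMPLE_AST` fragment is hand-written for illustration and real pandoc output may tokenize the text differently. `inlines_to_text`, `iter_citations`, and `summarize_citations` are hypothetical helper names.

```python
from typing import Any, Dict, Iterator, List


def inlines_to_text(inlines: List[Dict[str, Any]]) -> str:
    """Flatten pandoc inline elements to plain text (Str and spaces only)."""
    parts = []
    for inline in inlines:
        kind = inline.get("t")
        if kind == "Str":
            parts.append(inline["c"])
        elif kind in ("Space", "SoftBreak"):
            parts.append(" ")
    return "".join(parts).strip()


def iter_citations(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every citation object stored under a Cite element."""
    if isinstance(node, dict):
        if node.get("t") == "Cite":
            citations, _rendered_inlines = node["c"]
            yield from citations
        else:
            for value in node.values():
                yield from iter_citations(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_citations(item)


def summarize_citations(ast: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return id, prefix, and suffix text for each citation in the document."""
    return [
        {
            "id": citation["citationId"],
            "prefix": inlines_to_text(citation["citationPrefix"]),
            "suffix": inlines_to_text(citation["citationSuffix"]),
        }
        for citation in iter_citations(ast.get("blocks", []))
    ]


# Hand-written fragment shaped like the AST for "[see @smith, p. 33]" (illustrative only).
SAMPLE_AST = {
    "blocks": [{"t": "Para", "c": [{"t": "Cite", "c": [
        [{
            "citationId": "smith",
            "citationPrefix": [{"t": "Str", "c": "see"}],
            "citationSuffix": [{"t": "Str", "c": ","}, {"t": "Space"},
                               {"t": "Str", "c": "p."}, {"t": "Space"},
                               {"t": "Str", "c": "33"}],
            "citationMode": {"t": "NormalCitation"},
            "citationNoteNum": 0,
            "citationHash": 0,
        }],
        [{"t": "Str", "c": "[see @smith, p. 33]"}],
    ]}]}]
}

print(summarize_citations(SAMPLE_AST))
```

A filter along these lines could, in principle, copy the prefix and suffix into attributes on the rendered citation so that the tooltip code can display them.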

Will post any findings here.

dhimmel avatar May 18 '21 01:05 dhimmel