JSON reference data containing square brackets mistakenly produces MathJax?
First of all, thanks for Manubot; I am looking forward to seeing it develop!
After following the setup instructions and getting my first Manubot paper up and running (this went very smoothly), I encountered an issue where some citations containing square brackets are rendered as MathJax (I think).
My reference data is in the manual-references.json file and I am citing using standard pandoc citations. I am using an APA author-year style CSL file. The JSON used is the output from the pandoc-citeproc command (converted from bibtex).
The reference data that leads to the odd-looking citation is...
{
  "id": "nhknenkan43",
  "issued": {
    "date-parts": [
      [
        1943
      ]
    ]
  },
  "publisher": "NHK",
  "publisher-place": "Tokyo",
  "title": "Shōwa 18-nen rajio nenkan \\[1943 radio yearbook\\]",
  "type": "book"
}
As you can see, this work has no author, so the default seems to be to use the 'title' field instead. This means the in-text citation ends up rendered like this...

Maybe the \\ is being interpreted somewhere as a hard line break? Anyway, the produced HTML for this looks like this...

I realise that the main constituency for Manubot is probably not social science people like me, but you never know! I should mention that I use pandoc, with the same bib data, to generate my documents locally and do not see this problem there.
@scokobro thanks for reporting this and trying Manubot. We are interested in supporting the social sciences.
Do you have this example in a public manuscript GitHub repository? I'd like to try to reproduce it locally.
Hi @gitter, thanks for getting back on this. The repository is https://github.com/scokobro/RT-manubot
If it's of any use, this is the original bibtex, before conversion to JSON...
@book{nhknenkan43,
  Address = {Tokyo},
  Booktitle = {NHK Radio Yearbook 1943},
  Editor = {{NHK}},
  Publisher = {NHK},
  Short = {Nenkan 1943},
  Title = {Shōwa 18-nen Rajio Nenkan [1943 Radio Yearbook]},
  Year = {1943}
}
I have a few ideas.
- Disable MathJax for your manuscript. This is easy but only a viable option if you do not need MathJax elsewhere in your manuscript. In my local testing, I edited build/pandoc/defaults/html.yaml:
# Pandoc --defaults for HTML output.
# Load on top of common defaults.
to: html5
output-file: output/manuscript.html
include-after-body:
- build/themes/default.html
- build/plugins/anchors.html
- build/plugins/accordion.html
- build/plugins/tooltips.html
- build/plugins/jump-to-first.html
- build/plugins/link-highlight.html
- build/plugins/table-of-contents.html
- build/plugins/lightbox.html
- build/plugins/attributes.html
#- build/plugins/math.html
- build/plugins/hypothesis.html
- build/plugins/analytics.html
#variables:
# math: ''
#html-math-method:
# method: mathjax
and build/pandoc/defaults/pdf-weasyprint.yaml (may not be necessary):
# Pandoc --defaults for PDF output via weasyprint.
# Load on top of HTML defaults.
output-file: output/manuscript.pdf
pdf-engine: weasyprint
pdf-engine-opts:
- '--presentational-hints'
#html-math-method:
# method: webtex
# url: 'https://latex.codecogs.com/svg.latex?'
Commenting out these HTML math-related lines gives me the following in the HTML version of a local build:

It's not perfect because the escaped brackets are still there. However, it's unlikely to break anything else.
- Edit the output HTML in the build script. After these lines in build/build.sh (https://github.com/manubot/rootstock/blob/0964fd7ed7fdc1e3b468b5f4f1234d24de0b14fc/build/build.sh#L30-L36), add the following sed commands to convert \[ to [ and \] to ]:
sed -i 's/\\\[/[/g' output/manuscript.html
sed -i 's/\\\]/]/g' output/manuscript.html
This looks nice in my local build.

However, there is a risk that this blunt replacement could break something else. You would likely be able to detect this in the HTML. I think it will fix the PDF version too when the PDF manuscript is built with Docker, as it is on GitHub. I don't think the WeasyPrint version would be fixed.
- Strip the \\[ from the converted JSON. This does not seem to work. I tested the following with the command manubot cite --render url:test --bibliography=test2.json --format=markdown but Pandoc adds back the \[ when writing to Markdown:
[
  {
    "editor": [
      {
        "literal": "NHK"
      }
    ],
    "id": "url:test",
    "issued": {
      "date-parts": [
        [
          1943
        ]
      ]
    },
    "publisher": "NHK",
    "publisher-place": "Tokyo",
    "title": "Shōwa 18-nen rajio nenkan [1943 radio yearbook]",
    "type": "book"
  }
]
The following seems like the correct title for the CSL JSON:
"title": "Shōwa 18-nen rajio nenkan [1943 radio yearbook]",
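The sed replacement from the second idea can also be sketched in Python, which may be easier to try out before wiring it into build/build.sh. This is only a minimal sketch: `unescape_brackets` is a hypothetical helper name, not part of Manubot, and it is just as blunt as the sed version, so it would also rewrite any genuine \[ ... \] display math in the page.

```python
def unescape_brackets(html):
    """Replace backslash-escaped square brackets with plain ones.

    Hypothetical helper mirroring the two sed commands; it does not
    distinguish citation text from real display math.
    """
    return html.replace("\\[", "[").replace("\\]", "]")


# Sample fragment standing in for a citation in output/manuscript.html.
sample = r"<em>Shōwa 18-nen rajio nenkan \[1943 radio yearbook\]</em>"
print(unescape_brackets(sample))
```

To apply it to a real build, the contents of output/manuscript.html would be read, passed through the function, and written back after the HTML build step.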
So is the underlying issue that pandoc is prefixing brackets with backslashes in titles, causing MathJax to incorrectly treat plain text as equations? @jgm any idea whether this is a bug we should report to pandoc or pandoc-citeproc, or whether there is a solution on the user's end that is not disabling MathJax entirely?
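For context, MathJax's TeX input by default recognises \[ ... \] as display-math delimiters, which is why an escaped bracket pair in a title can look like an equation. The sketch below imitates that delimiter scan with a regular expression to show the difference between the escaped and plain titles. It is an illustration only, not MathJax's actual parser, and `find_display_math` is a hypothetical name.

```python
import re

# Rough imitation of a display-math delimiter scan: a backslash-bracket
# opener, some content, then a backslash-bracket closer.
DISPLAY_MATH = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)


def find_display_math(text):
    """Return the spans that such a delimiter scan would treat as math."""
    return DISPLAY_MATH.findall(text)


escaped = r"Shōwa 18-nen rajio nenkan \[1943 radio yearbook\]"
plain = "Shōwa 18-nen rajio nenkan [1943 radio yearbook]"

print(find_display_math(escaped))  # ['1943 radio yearbook']
print(find_display_math(plain))    # []
```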
@scokobro, a dirty but quick solution might just be changing the title on your end to "Shōwa 18-nen rajio nenkan: 1943 radio yearbook". But we'd love to get this fixed in the long term.
Well, I'm happy to report that (a) pandoc is transitioning to built-in citeproc support, so that pandoc-citeproc will no longer be needed; (b) the dev version of pandoc produces this with pandoc -f bibtex -t csljson:
[
  {
    "editor": [
      {
        "literal": "NHK"
      }
    ],
    "id": "nhknenkan43",
    "issued": {
      "date-parts": [
        [
          1943
        ]
      ]
    },
    "publisher": "NHK",
    "publisher-place": "Tokyo",
    "title": "Shōwa 18-nen rajio nenkan [1943 radio yearbook]",
    "title-short": "[CSL STYLE ERROR: reference with no printed form.]",
    "type": "book"
  }
]
(There's clearly an issue here involving title-short, which I'll have to look at, but at least the brackets are no longer escaped.) You can get a nightly from the jgm/pandoc repository (under Actions) if you want to test further.
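To check whether a converted CSL JSON file still contains escaped brackets, something like the following could be run on its contents. This is a minimal sketch under stated assumptions: `titles_with_escaped_brackets` is a hypothetical helper, the input is assumed to be a JSON array of CSL items, and only each item's title field is inspected. The two sample strings are trimmed versions of the entries quoted in this thread.

```python
import json


def titles_with_escaped_brackets(csl_json_text):
    """Return the ids of CSL items whose title has a backslash-escaped bracket."""
    items = json.loads(csl_json_text)
    return [
        item.get("id")
        for item in items
        if "\\[" in item.get("title", "") or "\\]" in item.get("title", "")
    ]


# Trimmed entry with escaped brackets, as in the original report.
old_output = r'[{"id": "nhknenkan43", "title": "Shōwa 18-nen rajio nenkan \\[1943 radio yearbook\\]"}]'
# Trimmed entry as produced by the dev version of pandoc.
new_output = '[{"id": "nhknenkan43", "title": "Shōwa 18-nen rajio nenkan [1943 radio yearbook]"}]'

print(titles_with_escaped_brackets(old_output))  # ['nhknenkan43']
print(titles_with_escaped_brackets(new_output))  # []
```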
Thanks to everyone for your advice. The Chicago Manual of Style requires citations as footnotes with short titles, and requires that translations of titles of non-English (or 'other language') material appear in square brackets. I am relieved to say that for me this is a passing issue (as I generally wouldn't write for a history journal), but it will be important for anyone using translated reference materials and required to use CMS styles. In the meantime I'm happy to disable MathJax. Thanks again, and good luck with everything!