citeproc-py
Not all bibutils macros are supported by the BibTeX parser
bibutils is a commonly used bibliography converter. It is useful to citeproc-py users because it can convert records to BibTeX (.bib) for import into citeproc-py.
I've created an archive of bibutils releases at https://github.com/jayvdb/bibutils-archive , so it is easier to navigate and link to specifics.
The bibutils LaTeX macro/encoding mapping is at https://github.com/jayvdb/bibutils-archive/blob/master/lib/latex.c . It is only accessed by the functions `latex2char` and `uni2latex`. It always emits the `bib1` member of the struct for each Unicode character.
It would be good to ensure that all LaTeX sequences it emits are accepted by citeproc-py's BibTeX parser.
The following is a diff of supported macros after a quick cleanup and comparison, using only the macros matching `^[a-zA-Z]*$` (i.e. quickly excluding any of the more complex macros):
--- bibutils
+++ citeproc
aa
AA
ae
AE
+b
+c
+copyright
+d
+dag
+ddag
+dh
+DH
dj
DJ
-emspace
-enspace
+G
+guillemotleft
+guillemotright
+guilsinglleft
+guilsinglright
+H
i
+k
l
L
-ldots
+ng
+NG
o
O
oe
OE
+P
+pounds
+quotedblbase
+quotesinglbase
+r
+S
ss
-textacutedbl
+t
+TeX
-textasciiacute
-textasciiacutex
textasciicircum
-textasciigrave
textasciitilde
-textbaht
-textbardbl
-textbrokenbar
+textasteriskcentered
+textbackslash
+textbar
+textbraceleft
+textbraceright
textbullet
-textcelcius
-textcent
-textcircledP
+textcircled
textcopyright
textdagger
textdaggerdbl
-textdegree
-textdiv
-textdong
-textdownarrow
-textestimated
-texteuro
+textdollar
+textellipsis
+textemdash
+textendash
textexclamdown
-textflorin
-textfractionsolidus
-textfrenchfranc
-textlangle
-textleftarrow
-textlira
-textlnot
-textlquill
-textmho
-textmu
-textnaira
-textnospace
-textnumero
-textohm
-textonehalf
-textonequarter
-textonesuperior
-textopenbullet
+textgreater
+textless
textordfeminine
textordmasculine
textparagraph
textperiodcentered
-textpertenthousand
-textpm
textquestiondown
-textrangle
+textquotedbl
+textquotedblleft
+textquotedblright
+textquoteleft
+textquoteright
textregistered
-textrightarrow
-textrquill
textsection
-textservicemark
textsterling
-textsurd
-texttenthousand
-textthreequarters
-textthreesuperior
-texttimes
texttrademark
-texttwosuperior
-textuparrow
+textunderscore
textvisiblespace
-textwon
-textyen
+th
+TH
-thinspace
+u
+U
+v
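For anyone wanting to reproduce a comparison like the one above, here is a minimal sketch of how it could be done. It assumes the two macro name sets have already been extracted by hand; `macro_diff` and its arguments are hypothetical names, not part of bibutils or citeproc-py.

```python
import re

# Only "simple" macro names, i.e. those matching ^[a-zA-Z]*$.
SIMPLE = re.compile(r"[a-zA-Z]*")


def macro_diff(bibutils, citeproc):
    """Return diff-style lines comparing two sets of macro names.

    ' name' means both sides know the macro, '-name' means only
    bibutils emits it, '+name' means only citeproc accepts it.
    """
    left = {m for m in bibutils if SIMPLE.fullmatch(m)}
    right = {m for m in citeproc if SIMPLE.fullmatch(m)}
    lines = []
    # Case-insensitive order, lowercase before uppercase on ties.
    for name in sorted(left | right, key=lambda s: (s.lower(), s.swapcase())):
        if name in left and name in right:
            prefix = " "
        elif name in left:
            prefix = "-"
        else:
            prefix = "+"
        lines.append(prefix + name)
    return lines


if __name__ == "__main__":
    print("\n".join(macro_diff({"aa", "AA", "ldots", "'"},
                               {"aa", "AA", "copyright"})))
```

The `-` lines are the interesting ones for this issue: they are macros bibutils can emit that the parser would not recognise.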
Are there any of these that are not suitable for citeproc-py's BibTeX parser?
If there is general support for keeping in sync with at least a subset of bibutils LaTeX output, we could write a test that parses `lib/latex.c` to ensure the defined subset is supported.
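A rough sketch of what such a test helper might look like follows. It makes two assumptions: that the macros appear in `lib/latex.c` as ordinary C string literals with doubled backslashes, and that the caller supplies the set of macro names the parser supports. The sample rows are invented for illustration and may not match the real file's layout; `macros_in_c_source` and `unsupported_macros` are hypothetical names.

```python
import re

# A C string literal, allowing backslash escapes inside it.
C_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')
# In C source a LaTeX backslash is written as two backslashes.
MACRO = re.compile(r"\\\\([a-zA-Z]+)")

# Invented sample rows (assumption: not copied from bibutils).
SAMPLE = r'''
{ 162, "{\\textcent}", "\\textcent", "" }, /* cent sign */
{ 229, "{\\aa}", "\\aa", "" },
'''


def macros_in_c_source(source):
    """Collect alphabetic LaTeX macro names found in C string literals."""
    names = set()
    for literal in C_STRING.findall(source):
        names.update(MACRO.findall(literal))
    return names


def unsupported_macros(source, supported):
    """Return the sorted macro names in source that are not in supported."""
    return sorted(macros_in_c_source(source) - set(supported))


if __name__ == "__main__":
    # A real test would read lib/latex.c and pass the parser's macro names.
    print(unsupported_macros(SAMPLE, {"aa"}))
```

A test built on this could assert that `unsupported_macros` returns an empty list for whatever subset is agreed on, so that a new bibutils release adding macros would make the test fail visibly.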
I had a quick look at the list, and I think most of them simply map to Unicode characters.
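To illustrate what "simply map to Unicode characters" could mean in practice, here is a small sketch for a handful of the missing macros. The table and the `replace_simple_macros` helper are hypothetical and are not citeproc-py's actual implementation; the real parser may structure its macro table quite differently.

```python
import re
import unicodedata

# A few of the macros from the list, mapped by Unicode character name.
SIMPLE_MACROS = {
    "textcent": unicodedata.lookup("CENT SIGN"),
    "textyen": unicodedata.lookup("YEN SIGN"),
    "textdegree": unicodedata.lookup("DEGREE SIGN"),
    "texteuro": unicodedata.lookup("EURO SIGN"),
    "textonehalf": unicodedata.lookup("VULGAR FRACTION ONE HALF"),
    "ldots": unicodedata.lookup("HORIZONTAL ELLIPSIS"),
}


def replace_simple_macros(text, table=SIMPLE_MACROS):
    """Replace known argument-less macros; leave unknown ones untouched."""
    return re.sub(
        r"\\([a-zA-Z]+)",
        lambda m: table.get(m.group(1), m.group(0)),
        text,
    )


if __name__ == "__main__":
    print(replace_simple_macros(r"10 \textyen or 5 \texteuro"))
```

This ignores TeX details such as whitespace being swallowed after a control word, so it is only meant to show the shape of the mapping.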