python-bibtexparser
Dump BibDatabase with Unicode strings to string with LaTeX accents
This is a cross post from Stack Overflow.
I can use BibtexParser to parse a BibTeX file with special characters encoded as LaTeX into a Unicode BibDatabase as follows:
import bibtexparser
from bibtexparser.bparser import BibTexParser
from bibtexparser.customization import convert_to_unicode
with open('bibtex.bib') as bibtex_file:
    parser = BibTexParser()
    parser.customization = convert_to_unicode
    bib_database = bibtexparser.load(bibtex_file, parser=parser)
But what if I have a Unicode BibDatabase like bib_database that I want to write to a string where the Unicode characters have been converted to their LaTeX encodings?
I see I can dump the database to a string as follows:
bibtex_str = bibtexparser.dumps(bib_database)
but the characters in bibtex_str are still Unicode. dumps has an optional writer parameter, but the documentation doesn't discuss whether or how it can be used to control the special-character encoding of the output string.
Any help would be much appreciated!
Hey there. I just wanted to bump this thread. Since I've received no response here or on Stack Overflow, I presume this functionality is not implemented. I'm willing to make a pull request if you could give me some guidance on how you would want this implemented. My intuition is to integrate it into BibTexWriter in a similar fashion to BibTexParser. For example:
import bibtexparser
from bibtexparser.bwriter import BibTexWriter
from bibtexparser.customization import convert_to_latex
bib_database = ...
writer = BibTexWriter()
writer.customization = convert_to_latex
bibtex_str = bibtexparser.dumps(bib_database, writer=writer)
This would require implementing convert_to_latex, adding the customization attribute to BibTexWriter, and calling customization when writing the entries. My guess is the latter would happen in BibTexWriter's _entry_to_bibtex method, though there seem to be other _*_to_bibtex methods that may need to be tweaked as well.
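To make the proposal more concrete, here is a rough, standard-library-only sketch of what a convert_to_latex customization might look like. It is not part of bibtexparser: convert_to_latex, string_to_latex_sketch and the COMBINING_TO_LATEX table are all hypothetical names, and the table only covers a handful of common accents. It assumes each record is a plain dict with "ID" and "ENTRYTYPE" keys, as the v1 parser produces.

```python
import unicodedata

# Hypothetical, deliberately small table: combining mark -> LaTeX accent command.
COMBINING_TO_LATEX = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # diaeresis / umlaut
    "\u0327": "c",   # cedilla
}


def string_to_latex_sketch(text):
    """Replace accented characters with LaTeX accent commands where known."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in COMBINING_TO_LATEX:
            base, mark = decomposed
            accent = COMBINING_TO_LATEX[mark]
            # Letter-named commands such as \c need a space before the base.
            sep = " " if accent.isalpha() else ""
            out.append("{\\" + accent + sep + base + "}")
        else:
            # ASCII and anything not in the table passes through unchanged.
            out.append(ch)
    return "".join(out)


def convert_to_latex(record):
    """Customization-style function: takes a record dict and returns it."""
    for key, value in list(record.items()):
        # Leave the citation key and entry type untouched.
        if key not in ("ID", "ENTRYTYPE") and isinstance(value, str):
            record[key] = string_to_latex_sketch(value)
    return record


if __name__ == "__main__":
    entry = {
        "ENTRYTYPE": "article",
        "ID": "muller2020",
        "author": "M\u00fcller, Jos\u00e9",  # Müller, José
        "title": "Caf\u00e9",                # Café
    }
    print(convert_to_latex(entry))
```

Until the writer has such a hook, the same function could presumably be applied by hand to each dict in bib_database.entries before calling bibtexparser.dumps, at the cost of modifying the database in place.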
Any feedback on this would be much appreciated. Thanks!
This is possible in v2 with the LatexEncodingMiddleware.
Thanks @MiWeiss. Believe it or not this issue is still relevant for me so I appreciate you following up.
Hi @alancleary
That's great to hear, thanks for your comment. Note that v2, which contains this fix, is still only in beta (see the README on the main branch) and has some missing features, but I'd encourage you to give it a try and tell me if it fits your needs.