Dump BibDatabase with Unicode strings to string with LaTeX accents

Open alancleary opened this issue 5 years ago • 1 comments

This is a cross post from Stack Overflow.

I can use BibTexParser to parse a BibTeX file with special characters encoded as LaTeX into a Unicode BibDatabase as follows:

import bibtexparser
from bibtexparser.bparser import BibTexParser
from bibtexparser.customization import convert_to_unicode

with open('bibtex.bib') as bibtex_file:
    parser = BibTexParser()
    parser.customization = convert_to_unicode
    bib_database = bibtexparser.load(bibtex_file, parser=parser)

But what if I have a Unicode BibDatabase like bib_database that I want to write to a string where the Unicode characters have been converted to their LaTeX encodings?

I see I can dump the database to a string as follows

bibtex_str = bibtexparser.dumps(bib_database)

but the characters in bibtex_str are still Unicode. dumps has an optional writer parameter, but the documentation doesn't discuss if/how this can be used to control the special character encoding of the output string.

Any help would be much appreciated!

alancleary avatar Oct 08 '19 14:10 alancleary

Hey there. I just wanted to bump this thread. Since I've received no response here or on Stack Overflow, I presume this functionality is not implemented. I'm willing to make a pull request if you could give me some guidance on how you would want this implemented. My intuition is to integrate it into BibTexWriter in a similar fashion to BibTexParser. For example:

import bibtexparser
from bibtexparser.bwriter import BibTexWriter
from bibtexparser.customization import convert_to_latex

bib_database = ...

writer = BibTexWriter()
writer.customization = convert_to_latex
bibtex_str = bibtexparser.dumps(bib_database, writer=writer)

This would require implementing convert_to_latex, adding a customization attribute to BibTexWriter, and calling customization when writing the entries. My guess is that the last step would happen in BibTexWriter's _entry_to_bibtex method, though there seem to be other _*_to_bibtex methods that may need to be tweaked as well.
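To make the proposal concrete, here is a minimal sketch of what a record-level convert_to_latex could look like. It is not part of bibtexparser: the function names, the accent table, and the choice to leave the ID and ENTRYTYPE keys untouched are all assumptions made for illustration, and only a handful of accents are covered.

```python
import unicodedata

# Hypothetical table: Unicode combining mark -> LaTeX accent command.
# Only a few common accents are listed; a real implementation needs many more.
COMBINING_TO_LATEX = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # diaeresis
}


def string_to_latex_sketch(text):
    """Replace accented characters with LaTeX accent commands (illustrative only)."""
    out = []
    # Compose first so each accented letter is a single code point.
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in COMBINING_TO_LATEX:
            base, mark = decomposed
            out.append("{\\%s%s}" % (COMBINING_TO_LATEX[mark], base))
        else:
            # Unknown or unaccented characters pass through unchanged.
            out.append(ch)
    return "".join(out)


def convert_to_latex(record):
    """Hypothetical customization: encode every string field except the key and type."""
    return {
        key: value
        if key in ("ID", "ENTRYTYPE") or not isinstance(value, str)
        else string_to_latex_sketch(value)
        for key, value in record.items()
    }
```

Since a v1 BibDatabase keeps its entries as a list of dicts, a function like this could be applied to each entry before calling dumps, without any change to BibTexWriter. Note that the sketch does not handle cases such as the dotless i or characters without a canonical decomposition.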

Any feedback on this would be much appreciated. Thanks!

alancleary avatar Nov 05 '19 16:11 alancleary

This is possible in v2 with the LatexEncodingMiddleware.

MiWeiss avatar May 26 '23 14:05 MiWeiss

Thanks @MiWeiss. Believe it or not this issue is still relevant for me so I appreciate you following up.

alancleary avatar May 30 '23 15:05 alancleary

Hi @alancleary

That's great to hear, thanks for your comment. Note that v2, which contains this fix, is still only in beta (see the main branch readme) and has some missing features, but I'd encourage you to give it a try and tell me if it fits your needs.

MiWeiss avatar May 30 '23 15:05 MiWeiss