Writing doesn't work for umlauts (likely UTF-8 formatting problem)
Describe the bug: When I create a library containing Unicode characters (tested with umlauts), the exported file does not show them correctly.
Reproducing
Version: Most current from PyPI
Code:
import bibtexparser
from bibtexparser import *
bib_library = bibtexparser.Library()
fields = []
fields.append(bibtexparser.model.Field("author", "ö"))
entry = bibtexparser.model.Entry("ARTICLE", "test", fields)
bib_library.add(entry)
print(bib_library.entries_dict)
bibtexparser.write_file("my_new_file.bib", bib_library)
Bibtex:
@ARTICLE{test,
author = {�}
}
Workaround: None confirmed; generating a string first and then writing it to a file manually might work (similar to the workaround for #394).
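A minimal sketch of that manual-write idea, assuming the BibTeX text is already available as a string (for example from `bibtexparser.write_string(bib_library)`, which is deliberately not called here so the snippet runs standalone). The helper name `write_bibtex_utf8` is hypothetical and not part of the library.

```python
import os
import tempfile


def write_bibtex_utf8(path: str, bibtex_text: str) -> None:
    """Write already-serialized BibTeX text to `path` as UTF-8.

    Hypothetical helper for illustration; not part of bibtexparser.
    """
    # Passing the encoding explicitly avoids falling back to the platform
    # default, which is not guaranteed to be UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        f.write(bibtex_text)


# Stand-in for the string the library would generate for the entry above.
bibtex_text = "@ARTICLE{test,\n\tauthor = {ö}\n}\n"

out_path = os.path.join(tempfile.mkdtemp(), "my_new_file.bib")
write_bibtex_utf8(out_path, bibtex_text)

# Read the file back with the same encoding to check the umlaut survived.
with open(out_path, encoding="utf-8") as f:
    round_tripped = f.read()
```

If `round_tripped` still contains `ö`, the problem is in how the file is opened for writing rather than in the generated string.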
Remaining Questions (Optional) Please tick all that apply:
- [ ] I would be willing to contribute a PR to fix this issue.
- [ ] This issue is a blocker, I'd be grateful for an early fix.
The proposed workaround seems to work for now. But it's a bit concerning to see that the library doesn't handle such simple cases of encoding well.
I have not reproduced this manually, but it would be reasonable to implement a change for the writer similar to what was done in #395 (which fixed #394, the parsing encoding issue). In short, the user should be able to pass in an encoding.
Would you be willing to contribute a fix?
Note: BibTeX does not actually support non-ASCII characters, and using the LatexEncodingMiddleware may fix large parts of this problem. However, newer replacements for BibTeX support UTF-8, so the issue remains valid and a fix still needs to be implemented.
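To illustrate what an ASCII-only representation looks like, here is a rough sketch that rewrites a few German characters as LaTeX escapes. It is not how LatexEncodingMiddleware is implemented; the mapping `UMLAUT_TO_LATEX` and the function `escape_umlauts` are hypothetical and cover only a handful of characters.

```python
# Illustrative mapping only: hypothetical and far from complete.
UMLAUT_TO_LATEX = {
    "ä": '{\\"a}',
    "ö": '{\\"o}',
    "ü": '{\\"u}',
    "Ä": '{\\"A}',
    "Ö": '{\\"O}',
    "Ü": '{\\"U}',
    "ß": "{\\ss}",
}


def escape_umlauts(text: str) -> str:
    """Replace the characters in the mapping with LaTeX escapes."""
    # Characters without a mapping are passed through unchanged.
    return "".join(UMLAUT_TO_LATEX.get(ch, ch) for ch in text)


escaped = escape_umlauts("Jörg Müller")
print(escaped)
```

Output encoded this way contains only ASCII, so it is unaffected by the file encoding, at the cost of being harder to read in the raw `.bib` file.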
I started a PR in #405.
PR #405 is looking for someone to take over. Volunteers, come forward :rocket: ;-)