Error in latex_to_unicode

Open dlesbre opened this issue 1 year ago • 0 comments

Describe the bug The latex_to_unicode function can fail with a rather obscure type error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 65, in latex_to_unicode
    string = _replace_all_latex(string, itertools.chain(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 53, in _replace_all_latex
    string = _replace_latex(string, l.rstrip(), u)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dorian/Programs/github/bibtex-autocomplete/venv/lib/python3.11/site-packages/bibtexparser/latexenc.py", line 35, in _replace_latex
    if unicodedata.combining(unicod):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: combining() argument must be a unicode character, not str

The problem is most likely due to lines like this one, where the encoding isn't a single Unicode character: https://github.com/sciunto-org/python-bibtexparser/blob/e4c6eb656f26363eab91510f209a9d7e32942db9/bibtexparser/latexenc.py#L941-L945

(Although this isn't the only example)
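For reference, the underlying behaviour can be shown with the standard library alone. This is a minimal sketch: the two-character string is a stand-in I made up for illustration, not the actual value from the table linked above.

```python
import unicodedata

# A single character is accepted. U+0301 (combining acute accent) has a
# non-zero canonical combining class, while a plain letter has class 0.
print(unicodedata.combining("\u0301"))
print(unicodedata.combining("a"))

# Any string that is not exactly one character is rejected with TypeError.
# "ab" is a hypothetical stand-in for a multi-character table entry.
try:
    unicodedata.combining("ab")
except TypeError as exc:
    print("raised", type(exc).__name__)
```

So any table entry whose Unicode side is longer than one character would hit this error as soon as it reaches the combining() check.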

Reproducing

Version: 1.4.1

Code:

from bibtexparser.latexenc import latex_to_unicode
latex_to_unicode("\\;")

Remaining Questions (Optional) Please tick all that apply:

[x] I would be willing to contribute a PR to fix this issue: my solution would be to put a try/except block around the call to unicodedata.combining and assume False if it fails. I haven't submitted this directly because I don't know what these non-unicode characters are and why they are there. If there is a good reason, there is probably a better way to handle them; if not, they should probably be removed.
[ ] This issue is a blocker, I'd be grateful for an early fix.
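A rough sketch of the try/except idea described above. The helper name is_combining is hypothetical and is not part of bibtexparser; it only illustrates the guard, not how it would be wired into _replace_latex.

```python
import unicodedata


def is_combining(text: str) -> bool:
    """Hypothetical helper: True only for a single combining character."""
    try:
        return unicodedata.combining(text) != 0
    except TypeError:
        # text is empty or longer than one character, so it cannot be
        # a lone combining mark; treat it as non-combining.
        return False


print(is_combining("\u0301"))  # single combining acute accent
print(is_combining("ab"))      # multi-character string, no exception
```

An explicit `len(text) == 1` check before calling combining() would probably work just as well and avoids using an exception for control flow.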

Related issue: https://github.com/dlesbre/bibtex-autocomplete/issues/12

Feb 16 '24 22:02 dlesbre