Hello,
I get the error below when using Docling. I have also included the Docling version and the command-line parameters.
Thank you in advance.
Bug
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Software\Docling\venv\Scripts\docling.exe\__main__.py", line 7, in <module>
  File "C:\Software\Docling\venv\Lib\site-packages\typer\main.py", line 338, in __call__
    raise e
  File "C:\Software\Docling\venv\Lib\site-packages\typer\main.py", line 321, in __call__
    return get_command(self)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Software\Docling\venv\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Software\Docling\venv\Lib\site-packages\typer\core.py", line 665, in main
    return _main(
           ^^^^^^
  File "C:\Software\Docling\venv\Lib\site-packages\typer\core.py", line 197, in _main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Software\Docling\venv\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Software\Docling\venv\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Software\Docling\venv\Lib\site-packages\typer\main.py", line 703, in wrapper
    return callback(**use_params)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Software\Docling\venv\Lib\site-packages\docling\cli\main.py", line 389, in convert
    export_documents(
  File "C:\Software\Docling\venv\Lib\site-packages\docling\cli\main.py", line 112, in export_documents
    conv_res.document.save_as_markdown(
  File "C:\Software\Docling\venv\Lib\site-packages\docling_core\types\doc\document.py", line 1942, in save_as_markdown
    fw.write(md_out)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.2288.0_x64__qbz5n2kfra8p0\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u015b' in position 895: character maps to <undefined>
Steps to reproduce
docling -v --to text --image-export-mode placeholder --ocr --ocr-lang it,en .\ERM.pdf
...
Docling version
2.11
...
@giuliastro Could you please provide a sample PDF that triggers this error? We need one to investigate.
Possibly related to https://github.com/DS4SD/docling/issues/598
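For anyone hitting the same error: the traceback ends in cp1252.py, which suggests the output file was opened with the Windows default encoding rather than UTF-8, and cp1252 has no mapping for '\u015b'. The sketch below is not Docling's code; it only reproduces that behaviour with plain Python file writes. The helper names write_text and try_write are hypothetical, and the idea that the file is opened without an explicit encoding is an assumption based on the traceback alone.

```python
import os
import tempfile

# Sample text containing U+015B, the character named in the traceback.
text = "sample \u015b text"


def write_text(path, content, encoding):
    """Write content to path using an explicit encoding (hypothetical helper)."""
    with open(path, "w", encoding=encoding) as fw:
        fw.write(content)


def try_write(content, encoding):
    """Return True if content can be written with the given encoding."""
    fd, path = tempfile.mkstemp(suffix=".md")
    os.close(fd)
    try:
        write_text(path, content, encoding)
        return True
    except UnicodeEncodeError:
        # Same exception type as in the report: the codec cannot map the character.
        return False
    finally:
        os.remove(path)


cp1252_ok = try_write(text, "cp1252")  # mimics the Windows default encoding
utf8_ok = try_write(text, "utf-8")     # explicit UTF-8 can represent the character

print("cp1252 write succeeded:", cp1252_ok)
print("utf-8 write succeeded:", utf8_ok)
```

If this is indeed the cause, setting the environment variable PYTHONUTF8=1 before running the docling command may work around it, since Python's UTF-8 mode makes open() default to UTF-8. I have not verified this against the reporter's PDF.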