
PDF-to-Word conversion reports a font error

Open itmeicn opened this issue 1 year ago • 2 comments

[INFO] Start to convert C:\Users\zspym\Desktop\PyQt6 Python桌面开发.pdf
[INFO] [1/4] Opening document...
[INFO] [2/4] Analyzing document...
Traceback (most recent call last):
  File "D:\python\tools\app\view\ocr_ofd_to_pdf_ui.py", line 33, in doTransfor
    popdf.pdf2docx(file_path=r''+self.fname,output_path=r''+newfile)
  File "D:\installed\Python311\Lib\site-packages\popdf\api\pdf.py", line 70, in pdf2docx
    mainPDF.pdf2docx(file_path, output_path)
  File "D:\installed\Python311\Lib\site-packages\popdf\core\PDFType.py", line 86, in pdf2docx
    cv.convert(word_path)
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\converter.py", line 329, in convert
    self.parse(start, end, pages, **settings).make_docx(docx_filename, **settings)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\converter.py", line 112, in parse
    return self.load_pages(start, end, pages)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\converter.py", line 153, in parse_document
    self._pages.parse(self.fitz_doc, **kwargs)
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\page\Pages.py", line 37, in parse
    raw_page.restore(**settings)
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\common\share.py", line 226, in inner
    objects = func(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\page\RawPage.py", line 67, in restore
    super().restore(raw_dict)
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\layout\Layout.py", line 74, in restore
    self.blocks.restore(data.get('blocks', []))
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\layout\Blocks.py", line 98, in restore
    block = TextBlock(raw_block)
            ^^^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\text\TextBlock.py", line 49, in __init__
    self.lines = Lines(parent=self).restore(raw.get('lines', []))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\text\Lines.py", line 31, in restore
    line = Line(raw)
           ^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\text\Line.py", line 54, in __init__
    self.spans = Spans(parent=self).restore(raw.get('spans', []))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\text\Spans.py", line 19, in restore
    span = TextSpan(raw_span)
           ^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\text\TextSpan.py", line 78, in __init__
    self._change_font_and_update_bbox(constants.DEFAULT_FONT_NAME)
  File "D:\installed\Python311\Lib\site-packages\pdf2docx\text\TextSpan.py", line 121, in _change_font_and_update_bbox
    font = fitz.Font(font_name)
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\installed\Python311\Lib\site-packages\fitz\fitz.py", line 9404, in __init__
    _fitz.Font_swiginit(self, _fitz.new_Font(fontname, fontfile, fontbuffer, script, language, ordering, is_bold, is_italic, is_serif, embed))
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: cannot find builtin font with name 'Arial'

itmeicn • Dec 28 '23 07:12

Development environment: Python 3.11, Windows 11 x64.

itmeicn • Dec 28 '23 08:12

This is a problem in the upstream library pdf2docx, and it has been fixed in version 0.5.7.

pip install pdf2docx --upgrade

dothinking • Jan 07 '24 16:01
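
For anyone landing here later, a quick way to tell whether the environment already has the fixed release is to compare the installed pdf2docx version against 0.5.7, the version named in the reply above. The sketch below is only an illustration: the helper names (`parse_version`, `has_font_fix`, `installed_pdf2docx_version`) are made up for this example and are not part of pdf2docx or popdf, and the version parsing is deliberately simple (it ignores pre-release suffixes such as `rc1`).

```python
import re
from importlib import metadata

# Version that the maintainer's reply names as containing the fix.
FIXED_IN = (0, 5, 7)


def parse_version(text):
    """Turn '0.5.7' into (0, 5, 7); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_font_fix(version_text):
    """Hypothetical helper: True if the version is at least 0.5.7."""
    return parse_version(version_text) >= FIXED_IN


def installed_pdf2docx_version():
    """Return the installed pdf2docx version string, or None if it is absent."""
    try:
        return metadata.version("pdf2docx")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_pdf2docx_version()
    if current is None:
        print("pdf2docx is not installed")
    elif has_font_fix(current):
        print(f"pdf2docx {current} is at or above 0.5.7")
    else:
        print(f"pdf2docx {current} is older than 0.5.7; run: pip install pdf2docx --upgrade")
```

If the script reports a version older than 0.5.7, the upgrade command above is the suggested next step; if the error persists on a newer version, that would point to a different cause than the one discussed in this thread.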