pdf2docx
Open source Python library for converting PDF to DOCX.
Output images are always written in PNG format, so any image whose colorspace.n is not in (1, 3) must be converted first.
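A minimal sketch of that check, written in plain Python so it runs without PyMuPDF. The function names `needs_conversion` and `cmyk_to_rgb` are hypothetical and not part of pdf2docx; the CMYK formula is the naive one without ICC profiles. With PyMuPDF itself, the conversion is usually done by building a new pixmap in the RGB colorspace, as noted in the comment.

```python
def needs_conversion(n_components: int) -> bool:
    """Return True when the colorspace is neither gray (n=1) nor RGB (n=3)."""
    return n_components not in (1, 3)


def cmyk_to_rgb(c: int, m: int, y: int, k: int) -> tuple:
    """Naive CMYK -> RGB conversion for channel values in 0..255."""
    def channel(ink: int) -> int:
        # Each RGB channel is dimmed by its own ink and by black (k).
        return round(255 * (1 - ink / 255) * (1 - k / 255))
    return (channel(c), channel(m), channel(y))


# With PyMuPDF the equivalent step would look roughly like:
#     if pix.colorspace.n not in (1, 3):
#         pix = fitz.Pixmap(fitz.csRGB, pix)

for n in (1, 3, 4):
    print(n, "needs conversion" if needs_conversion(n) else "can be saved as PNG")
```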
I am converting a PDF file to DOCX using the pdf2docx CLI command, but it is not preserving the hyperlinks inside the PDF. Is there anything I can add...
### Description of the bug **Describe the bug** I am consistently receiving the warning `[WARNING] Ignore Line "" due to overlap` when using `pdf2docx` to convert PDF files to DOCX....
### Description of the bug Some cells are being merged incorrectly. Is there anything wrong with Border.py or Cell.py? Please find the attachment below; left is PDF and docx is...
### Description of the bug The library with version 0.5.8 is not working as expected with Python 3.12. Please take a look into this. ### How to reproduce the bug...
I have no idea what values we can choose as dummy, but let's get the job done! ```bash File "pdf2docx/common/share.py", line 162, in return [int(s[i:i+2], 16) for i in [0, 2,...
### Description of the bug I am unable to convert the attached PDF to DOCX. [sample.pdf](https://github.com/user-attachments/files/18033880/sample.pdf) I am using the parse method to simply convert all pages. parse("sample.pdf", "sample.docx") Here...
### Description of the bug I'm encountering an error during parsing: "unsupported colorspace for '{output}'". My requirement is that I cannot modify the original PDF file, so I need to...
Add functionality to write the tables automatically to csv files
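One way this request could be approached is sketched below. It assumes the tables are already available as nested lists (a list of tables, each a list of rows of cell strings, which is the shape pdf2docx's table extraction is documented to return). The helper `write_tables_to_csv` and the file naming scheme are hypothetical, not part of pdf2docx.

```python
import csv
from pathlib import Path


def write_tables_to_csv(tables, out_dir, prefix="table"):
    """Write each table (a list of rows) to its own CSV file and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, table in enumerate(tables, start=1):
        path = out_dir / f"{prefix}_{index}.csv"
        # newline="" lets the csv module control line endings itself.
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            for row in table:
                # Empty cells may come back as None; write them as empty strings.
                writer.writerow(["" if cell is None else cell for cell in row])
        paths.append(path)
    return paths
```

Writing one file per table keeps tables with different column counts from being forced into a single sheet.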
### Description of the bug As shown in the image: ### How to reproduce the bug The PDF attachment is as follows: [英文表格.pdf](https://github.com/user-attachments/files/17813160/default.pdf) ### pdf2docx version 0.5.8 ### Operating system MacOS ### Python version 3.9