pdf2word
Multithreaded PDF-to-Word conversion in 60 lines of code
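As a rough illustration of the multithreaded approach, here is a minimal sketch using `concurrent.futures.ThreadPoolExecutor`. It is not the project's actual code: `convert_one` and `convert_all` are hypothetical names, and `convert_one` is a placeholder that only derives the output file name instead of doing a real conversion.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_one(pdf_name):
    # Hypothetical placeholder: a real converter would extract the PDF text
    # and write a .docx file. Here it only derives the output file name.
    stem = pdf_name[:-4] if pdf_name.lower().endswith(".pdf") else pdf_name
    return stem + ".docx"


def convert_all(pdf_names, max_workers=4):
    # Executor.map() yields results in input order, regardless of which
    # worker thread finishes first.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(convert_one, pdf_names))


if __name__ == "__main__":
    print(convert_all(["report.pdf", "notes.pdf"]))
```

Threads mainly help when the work is I/O-bound. For CPU-heavy parsing, `ProcessPoolExecutor` offers the same `map` interface and may be a better fit.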
The PDF contains images, but after conversion the Word document has no images. How can this be handled?
```text
λ python main.py
正在处理: 4月报销.pdf
WARNING:root:UniGB-UCS2-H
WARNING:pdfminer.converter:undefined: , 1050
WARNING:pdfminer.converter:undefined: , 2264
WARNING:pdfminer.converter:undefined: , 4409
WARNING:pdfminer.converter:undefined: , 4532
WARNING:pdfminer.converter:undefined: , 3493
WARNING:pdfminer.converter:undefined: , 1480
```
The generated doc
ImportError: cannot import name 'process_pdf' from 'pdfminer.pdfinterp' (/anaconda3/lib/python3.7/site-packages/pdfminer/pdfinterp.py)
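This ImportError usually means the installed pdfminer package does not provide `process_pdf`. Older releases and pdfminer3k define it in `pdfminer.pdfinterp`, while pdfminer.six iterates pages through `PDFPage.get_pages` in `pdfminer.pdfpage`. The sketch below only reports which entry point is importable. The function name `detect_pdfminer_api` and its return strings are made up for illustration.

```python
import importlib


def detect_pdfminer_api():
    """Report which text-extraction entry point the installed pdfminer offers."""
    try:
        pdfinterp = importlib.import_module("pdfminer.pdfinterp")
    except ImportError:
        return "not installed"
    # Older pdfminer / pdfminer3k expose process_pdf in pdfminer.pdfinterp.
    if hasattr(pdfinterp, "process_pdf"):
        return "process_pdf"
    # pdfminer.six iterates pages via pdfminer.pdfpage.PDFPage.get_pages.
    try:
        importlib.import_module("pdfminer.pdfpage")
    except ImportError:
        return "unknown"
    return "PDFPage.get_pages"


if __name__ == "__main__":
    print(detect_pdfminer_api())
```

If it prints `PDFPage.get_pages`, the code calling `process_pdf` was probably written against pdfminer3k. Installing that package in a clean environment, or porting the call to the page-iteration API, are the two likely ways forward.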
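Windows doesn't seem to support source, so I skipped the lines that follow source, edited the config, and ran the program directly. That produced the error in the title.
Cannot locate objid=3886. Unable to locate where the exception occurs.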
正在处理: Opportunities for using Navy marine mammals to explore associations between organochlorine contaminants and unfavorable effects on reproduction.pdf
WARNING:root:Cannot locate objid=67
WARNING:root:Wrong type: 0 required:
WARNING:root:Catalog not found!
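Cannot import process_pdf. How do I fix this?
Conversion fails when the PDF contains Chinese: PDFDocEncoding = ''.join( chr(x) for x in ( ValueError: chr() arg not in range(256)
I converted a PDF with a three-column layout. After conversion I had to split the paragraphs manually, but most of the words came out correct. Thanks a lot Y(^_^)Y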
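One possible cause, offered as an assumption since the thread does not confirm it: `chr()` only rejects values of 256 and above on Python 2, so this ValueError suggests a Python 3 build of pdfminer is being run by a Python 2 interpreter. The check below uses a hypothetical helper, `chr_accepts_wide_code_points`, to test the running interpreter.

```python
import sys


def chr_accepts_wide_code_points():
    # On Python 3, chr() accepts any Unicode code point.
    # On Python 2, chr() raises ValueError for values of 256 and above.
    try:
        chr(0x02D8)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(sys.version_info[0], chr_accepts_wide_code_points())
```

If this prints `False`, running the script with a Python 3 interpreter is the first thing to try.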