pypdf
Issue in text extraction (spaces)
Issue with text extraction (spacing)
Environment
Which environment were you using when you encountered the problem? Windows 10
$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.7.0
Code + PDF
import PyPDF2;PyPDF2.PdfFileReader(open('c:/file-0.pdf', 'rb')).pages[3].extract_text()
Result of the text extraction (beginning only):
APPROVEDShortlyaftertheGenevaBOFsession,thewww-vrmlmailinglistwascreatedtodiscuss\nthedevelopmentofaspecificationforthefirstversionofVRML.Theresponsetothelist
Other case (spaces disappearing?):
import PyPDF2;PyPDF2.PdfFileReader(open('c:/2017.pdf', 'rb')).pages[0].extract_text()
Observed in the footer ( 2018 年04 月)
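To make the first symptom easier to compare across versions, the share of whitespace in the extracted text can be measured. The following is only a minimal sketch: `whitespace_ratio` and `extract_page_text` are hypothetical helper names introduced here, not part of PyPDF2, and the extraction call simply mirrors the one-liner from this report (PyPDF2 2.x API).

```python
def whitespace_ratio(text: str) -> float:
    """Return the fraction of characters in `text` that are whitespace."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)


def extract_page_text(path: str, page_index: int) -> str:
    """Extract the text of one page with the same calls used in the report."""
    import PyPDF2  # imported lazily so whitespace_ratio works without PyPDF2

    with open(path, "rb") as fh:
        reader = PyPDF2.PdfFileReader(fh)
        return reader.pages[page_index].extract_text()


# First line of the extraction result quoted in this report.
sample = (
    "APPROVEDShortlyaftertheGenevaBOFsession,"
    "thewww-vrmlmailinglistwascreatedtodiscuss"
)
print(f"{whitespace_ratio(sample):.3f}")  # prints 0.000
```

On the affected file, `whitespace_ratio(extract_page_text('c:/file-0.pdf', 3))` would be expected to come out far lower than for ordinary English prose, where spaces make up a noticeable share of the characters.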