use pdfminer to extract chapters from a book

Open th0o0 opened this issue 6 years ago • 2 comments

I have a book (PDF format) with maybe 3 chapters. I want to use pdfminer (other tools are fine as long as they can do this) to parse the book, extract each chapter, and save them as chapter one.txt, chapter two.txt, and chapter three.txt.

How can I do that?

thanks.

th0o0 avatar Oct 27 '18 12:10 th0o0

I need this too. Has the problem been solved?

stud2008 avatar Jan 08 '20 08:01 stud2008

I need it too... So far I have found how to extract the titles, which is easy with the get_outlines() function. But I am still working out how to extract the text contained between two titles... Maybe by investigating the code of get_outlines()?

hcharp avatar Mar 18 '22 15:03 hcharp
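
One possible approach, as a rough sketch rather than a tested solution: use get_outlines() to find the page where each top-level outline entry starts, then extract the text of the pages from that page up to the start of the next entry. This assumes pdfminer.six, a PDF that actually has an outline (get_outlines() raises PDFNoOutlines otherwise), and that level-1 outline entries correspond to chapters. The helper names chapter_page_ranges, safe_filename, and extract_chapters are made up for this example. The destination handling follows the pattern used in pdfminer's own dumppdf tool. Splitting is per page, so a chapter that starts in the middle of a page will also pick up the end of the previous chapter.

```python
import re
from pathlib import Path


def chapter_page_ranges(starts, page_count):
    """Turn (title, first_page) pairs into (title, first_page, end_page) triples.

    Pages are zero-based and end_page is exclusive.
    """
    ordered = sorted(starts, key=lambda item: item[1])
    ranges = []
    for i, (title, first) in enumerate(ordered):
        # A chapter runs until the page where the next one begins.
        end = ordered[i + 1][1] if i + 1 < len(ordered) else page_count
        # Keep at least the starting page, even when two chapters share it.
        ranges.append((title, first, max(end, first + 1)))
    return ranges


def safe_filename(title):
    """Replace characters that are risky in file names (hypothetical helper)."""
    return re.sub(r"[^\w\- ]+", "_", title).strip() + ".txt"


def extract_chapters(pdf_path, out_dir="."):
    """Write one .txt file per top-level outline entry."""
    # Imported here so the helpers above work without pdfminer installed.
    from pdfminer.high_level import extract_text
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfpage import PDFPage
    from pdfminer.pdfparser import PDFParser
    from pdfminer.pdftypes import resolve1
    from pdfminer.psparser import PSLiteral

    with open(pdf_path, "rb") as fp:
        doc = PDFDocument(PDFParser(fp))
        # Map each page object's id to its zero-based position in the file.
        page_index = {
            page.pageid: i for i, page in enumerate(PDFPage.create_pages(doc))
        }

        def start_page(dest, action):
            # The destination sits on the entry itself or inside its action.
            action = resolve1(action)
            if dest is None and isinstance(action, dict):
                dest = action.get("D")
            dest = resolve1(dest)
            # Named destinations have to be looked up in the document.
            if isinstance(dest, PSLiteral):
                dest = dest.name
            if isinstance(dest, (str, bytes)):
                dest = resolve1(doc.get_dest(dest))
            if isinstance(dest, dict):
                dest = resolve1(dest.get("D"))
            if not isinstance(dest, list) or not dest:
                return None
            # The first element of an explicit destination refers to the page.
            return page_index.get(getattr(dest[0], "objid", None))

        starts = []
        for level, title, dest, action, _se in doc.get_outlines():
            if level != 1:
                continue  # assumption: chapters are the top-level entries
            page = start_page(dest, action)
            if page is not None:
                starts.append((title, page))

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for title, first, end in chapter_page_ranges(starts, len(page_index)):
        text = extract_text(pdf_path, page_numbers=set(range(first, end)))
        (out_dir / safe_filename(title)).write_text(text, encoding="utf-8")
```

Calling extract_chapters("book.pdf", "chapters") would then write one file per chapter into the chapters folder, named after the outline titles. If the PDF has no outline, this will not work and the chapter boundaries would have to be detected some other way, for example from heading text or font sizes.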