pdfminer
Use pdfminer to extract chapters from a book
I have a book (PDF format) with maybe 3 chapters. I want to use pdfminer (other tools are OK as long as they can do this) to parse the book, so that I can extract every chapter and save them as chapter one.txt, chapter two.txt, and chapter three.txt.
How can I do that?
Thanks.
I need this too. Has the problem been solved?
I need it too...
For now I have found how to extract the titles, which is easy with the get_outlines() function.
But I am still thinking about how to extract the text contained between two titles... Maybe by looking into the code of that get_outlines() function?
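One possible way to get the text between two titles is to map each outline entry to the page it points at, then cut the per-page text at those page boundaries. The sketch below assumes pdfminer.six, a PDF that actually has an outline (otherwise get_outlines() raises PDFNoOutlines), and outline entries that use explicit destinations or GoTo actions; named destinations are skipped. The helper names (split_by_start_pages, top_level_chapters, page_texts, save_chapters) are made up for illustration. It splits at page granularity only, so a chapter that starts mid-page will also include whatever precedes its title on that page.

```python
import re
from pathlib import Path


def split_by_start_pages(texts, chapters):
    """Group per-page text into chapters.

    texts: list of strings, one per page (index 0 is the first page).
    chapters: list of (title, start_page_index) pairs.
    Returns a list of (title, text) pairs ordered by start page.
    """
    ordered = sorted(chapters, key=lambda c: c[1])
    result = []
    for i, (title, start) in enumerate(ordered):
        # A chapter runs until the next chapter's first page (or the last page).
        end = ordered[i + 1][1] if i + 1 < len(ordered) else len(texts)
        result.append((title, "\n".join(texts[start:end])))
    return result


def top_level_chapters(pdf_path):
    """Return (title, page_index) for each level-1 outline entry.

    Assumption: entries carry an explicit destination or a GoTo action.
    """
    # Imported here so the pure splitting helper works without pdfminer.six.
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfpage import PDFPage
    from pdfminer.pdfparser import PDFParser
    from pdfminer.pdftypes import resolve1

    with open(pdf_path, "rb") as fp:
        doc = PDFDocument(PDFParser(fp))
        # Map each page's object id to its zero-based position in the file.
        index_of = {p.pageid: i for i, p in enumerate(PDFPage.create_pages(doc))}
        chapters = []
        for level, title, dest, action, _se in doc.get_outlines():
            if level != 1:
                continue
            target = resolve1(dest)
            if target is None and action is not None:
                act = resolve1(action)
                target = resolve1(act.get("D")) if isinstance(act, dict) else None
            # An explicit destination is a list whose first item refers to a page.
            if isinstance(target, list) and target and hasattr(target[0], "objid"):
                page = index_of.get(target[0].objid)
                if page is not None:
                    chapters.append((title, page))
        return chapters


def page_texts(pdf_path):
    """Return the text of every page, in order."""
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    return [
        "".join(el.get_text() for el in layout if isinstance(el, LTTextContainer))
        for layout in extract_pages(pdf_path)
    ]


def save_chapters(pdf_path, out_dir="."):
    """Write one .txt file per top-level outline entry."""
    parts = split_by_start_pages(page_texts(pdf_path), top_level_chapters(pdf_path))
    for title, text in parts:
        # Replace characters that are unsafe in file names.
        name = re.sub(r"[^\w\- ]+", "_", title).strip() or "untitled"
        Path(out_dir, name + ".txt").write_text(text, encoding="utf-8")
```

Calling save_chapters("book.pdf") (a placeholder path) would then write one text file per top-level title. If the book's outline uses named destinations, or if chapters must be cut exactly at the heading rather than at the page break, this sketch would need to be extended.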