How to remove footer and sidebar when parsing text?

Open nitikachoudhary16 opened this issue 4 years ago • 2 comments

How can I remove the footer and sidebar when parsing a PDF to text?

nitikachoudhary16 avatar Aug 13 '21 23:08 nitikachoudhary16

@nitikachoudhary16 Since a PDF has no fixed sidebar or footer areas, I doubt this can be done in general. However, if your PDFs are built following specific patterns, those patterns can be used to identify the footer and sidebar.
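
To illustrate the pattern idea, here is a rough sketch in Python that works on text you have already extracted page by page. It does not use any pdfparser API. It assumes the footer is one of the last lines on each page and repeats on most pages, apart from a changing page number. The function name `strip_repeated_footers` and the parameters `edge_lines` and `min_share` are made up for this example. A sidebar would probably need position information, which plain text does not carry, so this only covers footers.

```python
import re
from collections import Counter


def normalize(line):
    """Collapse whitespace and replace digit runs so 'Page 3' and 'Page 4' match."""
    return re.sub(r"\d+", "#", " ".join(line.split())).lower()


def strip_repeated_footers(pages, edge_lines=2, min_share=0.6):
    """Drop bottom-of-page lines that recur on most pages (hypothetical helper).

    pages: list of strings, one per page of already-extracted text.
    edge_lines: how many trailing non-empty lines per page to inspect (assumption).
    min_share: fraction of pages a line must appear on to count as a footer (assumption).
    """
    counts = Counter()
    for page in pages:
        lines = [l for l in page.splitlines() if l.strip()]
        # Count each normalized bottom line once per page.
        counts.update({normalize(l) for l in lines[-edge_lines:]})

    threshold = max(2, int(min_share * len(pages)))
    repeated = {key for key, n in counts.items() if n >= threshold}

    cleaned = []
    for page in pages:
        lines = [l for l in page.splitlines() if l.strip()]
        first_edge = len(lines) - edge_lines
        kept = [
            l for i, l in enumerate(lines)
            if not (i >= first_edge and normalize(l) in repeated)
        ]
        cleaned.append("\n".join(kept))
    return cleaned
```

You would call it with something like `strip_repeated_footers(page_texts)`, where `page_texts` is whatever list of per-page strings your parser gives you. If the footer is worded differently on every page, this will not catch it and you would need a pattern specific to your documents.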

qwertynik avatar Aug 16 '21 15:08 qwertynik

I have observed that a space gets inserted inside some words. For example, the word "community" comes out as "comm unity".

prakashmvss avatar Aug 17 '21 01:08 prakashmvss