Ignore pdf footer while reading

Open bhagyashriSawkar opened this issue 5 years ago • 5 comments

Hi,

I was previously using version 2.0.0, where the footer text was automatically ignored. I recently upgraded to version 2.4.0, and now the footer text is included when I read the PDF. I could not find any mention of this in the documentation.

Is there a flag that can be used to ignore the footer text?

bhagyashriSawkar, Jan 13 '20 11:01

Thanks for reporting this issue.

pdf-reader has never intentionally skipped content on a page, and nothing between 2.0.0 and 2.4.0 has changed that.

I guess it's possible one of the bug fixes in those versions means some text that was accidentally skipped is now being extracted?

Are you able to test the intermediate versions (2.1.0, 2.2.0, 2.2.1, 2.3.0) to pinpoint the exact version where you see the behaviour change?

> Is there a flag that can be used to ignore the footer text?

Unfortunately, no. I'm not opposed to adding ways to target or ignore parts of a page, but for now there's no option to do it.

yob, Jan 13 '20 12:01

> I guess it's possible one of the bug fixes in those versions means some text that was accidentally skipped is now being extracted?

Yes, I think so too.

I see the change starting with version 2.1.0.

> Unfortunately, no. I'm not opposed to adding ways to target or ignore parts of a page, but for now there's no option to do it.

Okay, thanks, that answers my query. I'll see if I can find another way to identify footer text.
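One possible way to identify footer text outside the library is to post-process the extracted page text and look for a closing line that repeats on most pages. The sketch below is only an illustration: `strip_repeated_footer` and its `min_share` threshold are hypothetical names, not part of pdf-reader, and it assumes the footer is a single line that differs between pages only in its digits (for example a page number).

```ruby
# Hypothetical helper, not part of pdf-reader: removes the last non-blank line
# of each page when (ignoring digits) the same line closes most of the pages.
def strip_repeated_footer(page_texts, min_share: 0.6)
  # Replace digit runs so "Page 1" and "Page 2" compare as equal.
  normalise = ->(line) { line.strip.gsub(/\d+/, "#") }

  # Last non-blank line of each page, or nil for an empty page.
  last_lines = page_texts.map do |text|
    text.lines.map(&:chomp).reject { |l| l.strip.empty? }.last
  end

  # Count how often each normalised closing line occurs.
  counts = last_lines.compact.each_with_object(Hash.new(0)) do |line, acc|
    acc[normalise.call(line)] += 1
  end
  candidate, count = counts.max_by { |_, n| n }

  # Leave the text alone unless one line closes most of a multi-page document.
  return page_texts if candidate.nil? || page_texts.size < 2
  return page_texts if count < page_texts.size * min_share

  page_texts.map do |text|
    lines = text.lines.map(&:chomp)
    idx = lines.rindex { |l| !l.strip.empty? }
    if idx && normalise.call(lines[idx]) == candidate
      lines[0...idx].join("\n").rstrip
    else
      text
    end
  end
end

# Usage with pdf-reader (needs the gem and a real file; "report.pdf" is a placeholder):
#   require "pdf-reader"
#   texts = PDF::Reader.new("report.pdf").pages.map(&:text)
#   puts strip_repeated_footer(texts)
```

Because the check is based on repetition, a document without a common closing line is returned unchanged, which avoids cutting real content from pages that have no footer.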

bhagyashriSawkar, Jan 13 '20 13:01

Given the change starts in 2.1.0, I'd guess it might be a result of commit a8ca5dc.

yob, Jan 13 '20 13:01

I'm going through the code to add a footer flag. Could we simply ignore the last line of every page when, say, an "ignore_footer" flag is set? Are there other scenarios I need to consider? For example, the footer could span two lines, or the flag could be set on a page that has no footer.

bhagyashriSawkar, Apr 27 '20 06:04
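To make the scenarios in the question concrete, here is a minimal sketch of what a "drop the last line" rule could look like when applied to already extracted page text. It is not pdf-reader code: `drop_footer`, `ignore_footer`, and `footer_lines` are hypothetical names used only for illustration, and the function assumes the footer is always the last `footer_lines` non-blank lines of the page.

```ruby
# Hypothetical sketch of the proposed behaviour; `ignore_footer` and
# `footer_lines` are illustrative names, not pdf-reader options.
def drop_footer(page_text, ignore_footer: false, footer_lines: 1)
  # Without the flag the text is returned untouched.
  return page_text unless ignore_footer

  lines = page_text.lines.map(&:chomp)
  # Skip trailing blank lines so the footer is counted from the last visible line.
  lines.pop while lines.any? && lines.last.strip.empty?
  # Drop up to footer_lines lines; a page shorter than that becomes empty.
  kept = lines[0, [lines.size - footer_lines, 0].max]
  kept.join("\n").rstrip
end

# One-line footer.
puts drop_footer("Body\nMore\n\nPage 1\n", ignore_footer: true)
# Two-line footer.
puts drop_footer("Body\nLine\nConfidential\nPage 1", ignore_footer: true, footer_lines: 2)
# No footer on the page, flag set: the last line of real content is lost.
puts drop_footer("Only body\nLast sentence", ignore_footer: true)
```

The last call shows why the "no footer present but the flag was set" case matters: a purely positional rule cannot tell a footer from body text, so it silently removes content. A multi-line footer is handled only if the caller knows its height in advance.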