pdfsearch

heading_search is reporting the incorrect line_num

Open mdietz3 opened this issue 3 years ago • 2 comments

I have tested this with multiple PDFs that were loaded into R as character vectors. One of them has a "CONTENTS" page on page 6: when previewing the text using head(text), the 6th element (one element per page) is the contents page. Searching for it using

heading_search('text',"CONTENTS")  

returns

keyword   page_num
CONTENTS  7

I tried using the function directly with the source PDF and the same result occurs.
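For comparison, the expected page can be cross-checked in base R without heading_search. This is only a sketch: the `text` vector below is a made-up stand-in for the real document, and `expected_pages` is a hypothetical helper, not part of pdfsearch.

```r
# Hypothetical stand-in for the PDF text: one element per page,
# with a blank 4th page and the contents on the 6th.
text <- c("Title", "Preface", "Acknowledgements", "", "Dedication",
          "CONTENTS\n1 Introduction")

# Hypothetical helper: indices of all pages containing the keyword
# (exact, case-sensitive match).
expected_pages <- function(pages, keyword) {
  grep(keyword, pages, fixed = TRUE)
}

expected_page <- expected_pages(text, "CONTENTS")
print(expected_page)
```

If this prints the page seen in head(text) while heading_search reports a different number, the discrepancy is in the page numbering rather than in the keyword match.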

mdietz3, Feb 09 '22 19:02

Thanks for submitting this. It is a holdover from an earlier modification to the code. I'll fix this in the dev version soon.

lebebr01, Feb 11 '22 19:02

@lebebr01 great, thanks for fixing it. To add some context, the issue seems to be tied to a blank page. The blank page shows as "" when looking at the document using head(document) in R. In the affected document the first 3 pages have text, the 4th is blank, and the next 2 have text (the 6th page is the table of contents). heading_search reports the correct page for headings before the blank page, but not after it. Removing only the blank page does not fix the error; removing all pages up to and including the blank page does. I suspect the blank page is being counted twice or otherwise alters the page numbering.
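A quick way to test that suspicion is to compare the size of the offset with the number of blank pages before the hit. The sketch below uses base R only; `pages` is a made-up vector with the same layout as the affected document, and `reported_page` is simply the value from this issue typed in by hand, not the output of a live heading_search call.

```r
# Hypothetical stand-in with the layout described above:
# three text pages, one blank page, two more text pages.
pages <- c("Title", "Preface", "Acknowledgements", "", "Dedication", "CONTENTS")

# Pages that are empty or contain only whitespace.
blank_pages <- which(!nzchar(trimws(pages)))

# Page where the heading actually sits, and the value reported in this issue.
expected_page <- grep("CONTENTS", pages, fixed = TRUE)
reported_page <- 7L  # assumption: copied from the output quoted in this issue

# How far off the report is, and how many blank pages precede the heading.
offset <- reported_page - expected_page
blanks_before <- sum(blank_pages < expected_page)

print(offset == blanks_before)
```

A match here is consistent with each blank page shifting later page numbers by one, but it does not show where in the code that happens.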

mdietz3, Feb 11 '22 21:02