I hope to find a way to remove headers and footers

Open bjtangseng opened this issue 1 year ago • 5 comments

I have been using the marker project and think it is very good. I'm not sure whether this is a problem with how I'm using it or a detail I overlooked, but I'd like a way to filter headers and footers out of PDFs, because the content in those areas is usually irrelevant: logos or boilerplate text. Could a parameter be added so that this information doesn't interfere with the conversion output?

Thank you.

bjtangseng avatar Dec 19 '24 12:12 bjtangseng

Can you please share an example PDF?

VikParuchuri avatar Dec 19 '24 17:12 VikParuchuri

Thank you very much for your reply. I'm attaching a sample file. It is a PDF that can be found publicly in China and doesn't involve any confidentiality issues. The header on the first page contains a logo and the address of the organization that wrote the document. From the second page onward, there are smaller headers with logos. Some files also have footers, mostly things like an introduction to the organization and a disclaimer.

I'd like a parameter that skips this information. I see that Surya can analyze the layout and gives clear positions for the header and footer regions. Could those regions be used as an exclusion list, so that recognition and further processing are skipped for them?
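
To make the idea concrete, here is a rough sketch of the kind of filtering I mean. It is not marker's or Surya's actual API: `LayoutBlock` and `drop_headers_and_footers` are hypothetical names, and the label strings `Page-header` and `Page-footer` are my assumption about what the layout model emits.

```python
from dataclasses import dataclass

# Assumed label names for header/footer regions; the real layout
# model's labels may differ.
EXCLUDED_LABELS = frozenset({"Page-header", "Page-footer"})


@dataclass
class LayoutBlock:
    """Hypothetical stand-in for one detected layout region on a page."""
    label: str
    text: str


def drop_headers_and_footers(blocks, excluded=EXCLUDED_LABELS):
    """Return the blocks whose label is not excluded, preserving order."""
    return [block for block in blocks if block.label not in excluded]


# Made-up page content, only to show the filter in action.
page = [
    LayoutBlock("Page-header", "Example Org | 1 Example Road"),
    LayoutBlock("Text", "Main body paragraph."),
    LayoutBlock("Page-footer", "Disclaimer: for reference only."),
]

kept = drop_headers_and_footers(page)
print("\n".join(block.text for block in kept))
```

Ideally the excluded regions would be dropped before text recognition runs, so no time is spent on them at all.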

Thank you

fileView.pdf

bjtangseng avatar Dec 20 '24 06:12 bjtangseng

Is there any progress? I have the same requirement.

myg133 avatar Jan 07 '25 08:01 myg133

Has there been any progress? I also need this feature

zw-0625 avatar Jan 10 '25 03:01 zw-0625

Same here. Any updates?

Pipasgonzalez avatar Mar 11 '25 11:03 Pipasgonzalez