Add support for Mozilla web PDF viewer

Open: brudolp opened this issue 4 years ago • 1 comment

Just tried to view a PDF file with https://github.com/mozilla/pdf.js#online-demo using browsh; sadly, most of the text is unreadable.

brudolp, Jun 28 '20 14:06

Interesting, thinking about it.

Yeah, it's pixelated graphics. I think it would be best to download the file and use a proper GUI viewer.

However, a potential solution, I think, requires the actual text to be stored in the PDF, i.e. the PDF has to be searchable. Then open it using less: as long as pdftotext is installed, less can handle it (on setups where a lesspipe input preprocessor is enabled).

less file.pdf

Some issues with this method.

  • PDFs would need to be searchable, i.e. contain the text to extract. Not all do.
  • A graphic PDF that isn't searchable, e.g. a scanned book or something printed to file through a PDF driver, would need to go through OCR first in order to get text to extract. Even then, there might be errors needing correction.
  • A lot of searchable PDFs out there contain poor-quality text, i.e. no one has corrected the OCR errors.
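
A rough sketch of the idea, for setups where less does not pick up PDFs on its own: call pdftotext directly, check whether any text came out, and only then page it. The helper names `has_text` and `view_pdf` are made up for illustration; only `pdftotext` (from poppler-utils) and `less` are real tools, and this assumes both are on the PATH.

```shell
#!/bin/sh
# Sketch: view a PDF's text layer in the terminal, with a hint when the
# file has no extractable text (e.g. a scan that still needs OCR).

# has_text (hypothetical helper): succeed if stdin contains at least one
# non-whitespace character. Form feeds between pages count as whitespace.
has_text() {
    grep -q '[^[:space:]]'
}

# view_pdf (hypothetical helper): extract the text with pdftotext and page it.
view_pdf() {
    pdf=$1
    # -layout keeps the physical layout; "-" sends the text to stdout.
    text=$(pdftotext -layout "$pdf" - 2>/dev/null) || {
        echo "pdftotext failed on $pdf" >&2
        return 1
    }
    if printf '%s' "$text" | has_text; then
        printf '%s\n' "$text" | less
    else
        echo "$pdf has no text layer; it would need OCR first" >&2
        return 2
    fi
}
```

After sourcing the file, `view_pdf file.pdf` would page the extracted text. It does nothing about the third point: poor-quality embedded text is shown as it is.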

andrewcrook, Jan 11 '21 16:01