The ability to save the scanned OCR book as a PDF or Word document, not just a text file

Open DraganRatkovich opened this issue 2 years ago • 3 comments

Bookworm currently lets the user save a scanned book only as a plain text file. This is inconvenient in some cases, since Word documents and PDF files are both widely used.

Describe alternatives you've considered

Allow the user to save the scanned book as a PDF or as a Microsoft Word document, which in turn gives more options for editing the resulting file in word processing programs. This can be done in either of the following ways:

  • Add the .pdf and .docx formats alongside .txt in the "Save As" dialog box, so the user can choose among the available file formats;
  • Add an "Export As" submenu to the File menu listing the three formats (.txt, .docx, and .pdf), so the user can pick a format, enter a file name, and save in that format.

@mush42 Let me know whether you think this is possible.

DraganRatkovich • Feb 27 '22 07:02

@DraganRatkovich It is possible, of course, but I can't see any benefit of those two formats over plain text. No structural information is extracted from the document except pages and lines: no headings, no paragraphs, and no formatting information. You can copy the text from the text file and paste it into Word, and Word will restore the pages and lines. Best, Musharraf

mush42 • Feb 27 '22 08:02

@mush42 Of course, but the main advantage of saving directly as PDF or DOCX is time. Processing the contents of the extracted text file in Microsoft Word can take a long time, especially if the scanned book has more than 300 pages.

DraganRatkovich • Feb 27 '22 08:02
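
For anyone curious what a direct .docx export might involve, here is a rough sketch using only the Python standard library. It is not Bookworm's code: the function names `build_document_xml` and `save_pages_as_docx` are hypothetical, and it assumes the OCR result is available as a list of strings, one per page. Since only pages and lines are known, each line becomes a paragraph and each page boundary becomes a page break. A real implementation would more likely use a library such as python-docx.

```python
import zipfile
from xml.sax.saxutils import escape

# Fixed package parts that a minimal .docx (an OPC zip package) needs.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_document_xml(pages):
    """Return WordprocessingML for a list of page strings (hypothetical helper)."""
    body = []
    for index, page_text in enumerate(pages):
        if index > 0:
            # Start every page after the first on a new page.
            body.append('<w:p><w:r><w:br w:type="page"/></w:r></w:p>')
        for line in page_text.splitlines():
            # One paragraph per OCR line; escape &, < and > for XML.
            # Assumption: the text contains no control characters that
            # are illegal in XML 1.0; real OCR output may need filtering.
            body.append(
                '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>'
                % escape(line)
            )
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>'
        % (W_NS, "".join(body))
    )


def save_pages_as_docx(pages, path):
    """Write the pages to `path` (a file name or binary file object)."""
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("_rels/.rels", RELS)
        archive.writestr("word/document.xml", build_document_xml(pages))
```

Calling `save_pages_as_docx(["First page\nsecond line", "Second page"], "book.docx")` would write a two-page document. As noted above, the result carries no headings or formatting, so what it saves over plain text is the copy-and-paste step.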