gImageReader
Remember boxes set by the user
Assume I have fifty images, each the result of scanning two pages of a book. I would like to prepare each image by manually defining the boxes, then start the OCR, leave the machine, and have a coffee until it's done.
Currently, as soon as I select another image file, the boxes I set on the previous file are forgotten. Ideally the boxes could be saved as a document, so that if I do not find the time to finish the 'boxing', I could load the last project next time as a gImageReader document that already has this set up. However, I am aware that this takes additional development time, and I consider the document creation a luxury ;-) The first step, having gImageReader remember the boxes across files within a session, would be really appreciated.
You should already be able to do this:
- Select all images you wish to recognize in the sources pane. This will kinda emulate a multi-page document.
- When recognizing, choose "Multiple pages", and as recognition area choose "Current selection". The same selection area is then recognized on all selected pages.
Does this solve your issue?
No, I was thinking about individual selection areas for different images.
Ah, I see. One of the plans I have after version 3.2.0 is to introduce a scripting interface (see also #103), and I wonder whether this would be best handled with such functionality. Another possibility would be to store the selection areas in the image session data (i.e. along with rotation etc.), and let the user choose between "Selection of first page" and "Stored selection" (and autodetect layout) when doing multi-page recognition.
I would store it in the image-session data and expose that data to the scripting interface, so it can be reused there.
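To make the proposal concrete, here is a minimal sketch of how per-image stored selections and the two recognition-area modes could fit together. This is not gImageReader code: `ImageSession`, `AreaMode`, `areas_to_recognize`, and the file names are hypothetical names invented for illustration, and the box format (x, y, width, height) is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple

# Assumed box format: (x, y, width, height) in image pixels.
Box = Tuple[int, int, int, int]


class AreaMode(Enum):
    """The two multi-page options named in the discussion."""
    FIRST_PAGE = "Selection of first page"
    STORED = "Stored selection"


@dataclass
class ImageSession:
    """Hypothetical per-image session data: rotation plus user-drawn boxes."""
    rotation: float = 0.0
    boxes: List[Box] = field(default_factory=list)


def areas_to_recognize(pages: List[str],
                       sessions: Dict[str, ImageSession],
                       mode: AreaMode) -> Dict[str, List[Box]]:
    """Return the boxes to recognize on each page for the chosen mode."""
    if not pages:
        return {}
    # Boxes of the first page, reused for every page in FIRST_PAGE mode.
    first = sessions.get(pages[0], ImageSession()).boxes
    result: Dict[str, List[Box]] = {}
    for page in pages:
        if mode is AreaMode.FIRST_PAGE:
            result[page] = list(first)
        else:
            # STORED: each page keeps its own boxes (empty if none were set).
            result[page] = list(sessions.get(page, ImageSession()).boxes)
    return result


# Example: two scans with different layouts.
sessions = {
    "scan01.png": ImageSession(boxes=[(10, 10, 400, 600), (450, 10, 400, 600)]),
    "scan02.png": ImageSession(rotation=1.5, boxes=[(20, 30, 820, 560)]),
}
pages = ["scan01.png", "scan02.png"]
stored = areas_to_recognize(pages, sessions, AreaMode.STORED)
first_page = areas_to_recognize(pages, sessions, AreaMode.FIRST_PAGE)
```

Keeping the boxes in the same structure as the rotation means a scripting interface could read or modify them through a single object per image, which is the reuse suggested above.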
Hi. Any news on this? Most documents have different layouts on different pages, and in my experience the auto layout detection is next to useless, so I have to manually place the selection boxes on each and every page. And since the boxes are not saved with the page, I cannot recognize more than one page at a time... I've been a medium to heavy document OCR user for over 15 years, and I tell you that most documents we need to OCR fall under this category. Cheers.