PDF export includes OCR text for pages marked blank

Open saracarl opened this issue 11 months ago • 0 comments

I am having an export issue with the following work: On the shelter places of blood sucking biting midges (Diptera:Ceratopogonidae) Culicoides and Leptoconops in the steepe zone of the Ukraine = O mestakh ukrytii krovososushchikh mokretsov (Diptera, Ceratopogonidae) Culicoides Leptoconops v stepnoi zone ... (Ukrainian Collection) | FromThePage

I tried exporting in several formats, and I keep getting the original OCR characters on the blank pages, even though those pages are specifically marked as blank. The OCR text does not show up in the FromThePage (FTP) interface, however. I attached the HTML export of the work.

I’ve been exporting a few of the works from this collection today, and this seems to be the only one with this issue so far. I’ve been exporting from the following location:

I was able to reproduce this by downloading the PDF export of the work. Page 4 is particularly egregious: cat85814661_20240403194123.pdf

saracarl • Apr 03 '24 19:04