Robert Sachunsky
It would be useful to have a dedicated processor for DPI estimation in OCR-D. That's because we [cannot rely on DPI metadata](https://github.com/OCR-D/spec/pull/129), although we need to. (Most Ocropy segmentation steps...
For example, in … [side-by-side figure: raw predictions | NMS result] … a 57% cross-column region does not get suppressed by two (mutually non-overlapping) column regions with 91% and 74%.
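To illustrate why this can happen, here is a minimal sketch, assuming the suppression step is a plain greedy, pairwise IoU-based NMS (the excerpt does not say which variant is actually used) and reading the percentages as confidence scores. The box coordinates, the 0.5 threshold and the helper names (`greedy_nms`, `covered_fraction`) are all hypothetical. The cross-column region overlaps each column only slightly in IoU terms, so no single column suppresses it, even though the two columns together cover almost all of it.

```python
# Hypothetical boxes as (x0, y0, x1, y1); scores taken from the example above.
COL_A = (0, 0, 100, 200)      # left column, 91%
COL_B = (110, 0, 210, 200)    # right column, 74%
CROSS = (0, 80, 210, 120)     # cross-column region, 57%

def area(box):
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)

def intersection(a, b):
    # Area of the overlap rectangle (0 if the boxes are disjoint).
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))

def iou(a, b):
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def greedy_nms(detections, iou_threshold=0.5):
    # Keep boxes in descending score order unless they overlap
    # a single already-kept box by more than the threshold.
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, other) <= iou_threshold for other, _ in kept):
            kept.append((box, score))
    return kept

def covered_fraction(box, others):
    # Share of `box` covered by `others`; valid only if `others`
    # are mutually non-overlapping, as in the example.
    return sum(intersection(box, o) for o in others) / area(box)

detections = [(COL_A, 0.91), (COL_B, 0.74), (CROSS, 0.57)]
kept = greedy_nms(detections)
print(len(kept))                                  # all three survive
print(round(iou(CROSS, COL_A), 3))                # low pairwise IoU
print(round(covered_fraction(CROSS, [COL_A, COL_B]), 3))  # high joint coverage
```

A criterion based on joint coverage by higher-scoring regions, rather than pairwise IoU, would flag the cross-column region here; whether that is the right fix for the actual models is a separate question.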
Since Detectron2-based models for textline segmentation have appeared, we should support them by adding an `operation_level=region`. Also, the `cat2class` mapping must be extended with the `TextLine` type to allow its...
We currently only use Detectron2's `DefaultPredictor` for inference: https://github.com/bertsky/ocrd_detectron2/blob/0272d95a930d5136bba29e530a3530c13ab17166/ocrd_detectron2/segment.py#L126

But the [documentation](https://detectron2.readthedocs.io/en/latest/modules/engine.html?highlight=defaultpredictor#detectron2.engine.defaults.DefaultPredictor) says:

> This is meant for simple demo purposes, so it does the above steps automatically. This is...
We frequently have the use case where some (or even all) of the file references have not been downloaded yet. But these URL references for images make OcrdBrowser stumble:
```
today at...
```
Fix #52
fixes a rare bug where reading order display could crash:
```
Traceback (most recent call last):
  File "/data/ocr-d/ocrd_all/venv/lib/python3.7/site-packages/ocrd_browser/util/gtk.py", line 109, in _run
    callback()
  File "/data/ocr-d/ocrd_all/venv/lib/python3.7/site-packages/ocrd_browser/util/gtk.py", line 66, in __call__
    self.callback(*self.args,...
```
Since the recent changes regarding image selection, I am getting the following:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/ocrd_browser/model/page.py", line 102, in get_image
    page_image, page_coords, page_image_info = ws.image_from_page(self.page, self.id,...
```
It would be great if it were possible to press Ctrl+F to search for certain text strings or element IDs in large pages.