Robert Sachunsky

735 comments by Robert Sachunsky

Not sure if this is the right place for a discussion, but IMO this is _not_ the right approach for efficient prediction yet. We should define a tf.data pipeline, allowing...

Sorry, in my previous comment I was thinking more about Eynollah than the Binarizer (hence the heavy CPU part). And @apacha's PR does already speed up by an order of...

tf.data pipelining with heavy CPU processing itself seems to be hard to get right: to get true parallelisation, one probably needs [tfaip](https://github.com/Planet-AI-GmbH/tfaip#data-pipeline-1)...

What I describe happens on TF 2.13.1, which should be fully supported. This issue is a show-stopper for me, as with OCR-D, it's not even possible to keep the results...

Spoiler: I know how to do this. Would you care for a PR?

> > But how to save a global JSON report in the METS? It would not "manifest a physical page" which OCR-D seems to demand for any file > >...

BTW I believe having a measurement of CER standard deviation or variance is also useful. See [here](https://github.com/ASVLeipzig/cor-asv-ann/blob/0ae6867eba39f73f5832b219f09f71788145d1c2/ocrd_cor_asv_ann/lib/alignment.py#L414-L433) for an implementation.
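
For illustration, here is a minimal sketch of how per-line CER values could be aggregated into a mean and a standard deviation in a single pass. It uses Welford's online update and is not taken from the linked code; the names `RunningCER`, `edit_distance` and `cer` are hypothetical, and CER is assumed to be the plain Levenshtein distance over code points divided by the length of the ground truth.

```python
import math


def edit_distance(a, b):
    """Plain Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(ocr, gt):
    """Character error rate of one line (assumption: normalised by GT length)."""
    return edit_distance(ocr, gt) / max(len(gt), 1)


class RunningCER:
    """Accumulates CER values and reports mean and sample standard deviation."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, value):
        # Welford's update: numerically stable, no need to store all values
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self):
        # sample variance (n - 1 in the denominator); 0.0 for fewer than 2 values
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0

    @property
    def stddev(self):
        return math.sqrt(self.variance)


stats = RunningCER()
for ocr_line, gt_line in [("abc", "abc"), ("abd", "abc"), ("xyz", "abc")]:
    stats.add(cer(ocr_line, gt_line))
print(f"mean CER {stats.mean:.3f}, stddev {stats.stddev:.3f}")
```

Note that this averages over lines, so short and long lines weigh the same. An aggregate CER (total edits divided by total ground-truth length) would give a different mean.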

Also, I wonder if this is even needed – #48 already covers prediction of a directory...

> > @cneud, yes, the issue can be solved with substitutions which can be configured by the users. > > Exactly. I would like to point out here that allowing...

> I just want to throw in some doubt on the belief that CERs are somehow comparable when produced by different tools. Do they count whitespace the same way? grapheme...
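
To make the doubt above concrete, here is a small sketch showing that one and the same OCR/ground-truth pair yields three different CERs, depending only on the Unicode normalisation form and on whether whitespace runs are collapsed. The function names and the `form` / `collapse_whitespace` parameters are hypothetical and do not reflect the options of any particular evaluation tool; CER is assumed to be the Levenshtein distance over code points divided by the ground-truth length.

```python
import re
import unicodedata


def edit_distance(a, b):
    """Plain Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(ocr, gt, form="NFC", collapse_whitespace=False):
    """CER over code points after the chosen (hypothetical) preprocessing."""
    ocr = unicodedata.normalize(form, ocr)
    gt = unicodedata.normalize(form, gt)
    if collapse_whitespace:
        # treat any run of whitespace as a single space
        ocr = re.sub(r"\s+", " ", ocr).strip()
        gt = re.sub(r"\s+", " ", gt).strip()
    return edit_distance(ocr, gt) / max(len(gt), 1)


gt = "B\u00fcro  offen"  # precomposed u-umlaut, two spaces
ocr = "Buro offen"       # umlaut missed, one space

print(cer(ocr, gt, "NFC"))                            # 2 edits / 11 code points
print(cer(ocr, gt, "NFD"))                            # 2 edits / 12 code points
print(cer(ocr, gt, "NFC", collapse_whitespace=True))  # 1 edit  / 10 code points
```

None of the three figures is wrong as such. They simply answer slightly different questions, which is why CERs from different tools are only comparable if these conventions are known to match.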