dinglehopper
Generate per-workspace CER + WER
- It's easy to calculate this from the individual CER/WER and the character/word counts.
- But how to save a global JSON report in the METS? It would not "manifest a physical page" which OCR-D seems to demand for any file
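A minimal sketch of the aggregation mentioned in the first point: each page's error rate is weighted by its character or word count, which is the same as dividing the total number of errors by the total count. The `PageResult` class and its field names are assumptions for illustration, not dinglehopper's actual data model.

```python
from dataclasses import dataclass


@dataclass
class PageResult:
    """Hypothetical container for one page-wise report."""
    cer: float          # character error rate of the page
    wer: float          # word error rate of the page
    n_characters: int   # number of characters the CER refers to
    n_words: int        # number of words the WER refers to


def aggregate(pages):
    """Combine page-wise CER/WER into workspace-wide values."""
    total_chars = sum(p.n_characters for p in pages)
    total_words = sum(p.n_words for p in pages)
    # cer * n_characters recovers the (possibly fractional) error count per page.
    char_errors = sum(p.cer * p.n_characters for p in pages)
    word_errors = sum(p.wer * p.n_words for p in pages)
    return {
        # Returning 0.0 for an empty workspace is a choice made for this sketch.
        "cer": char_errors / total_chars if total_chars else 0.0,
        "wer": word_errors / total_words if total_words else 0.0,
        "n_characters": total_chars,
        "n_words": total_words,
    }
```

Note that this is not the plain average of the page-wise rates: a long page influences the global value more than a short one.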
@kba @bertsky @cneud Any thoughts on this?
Yes, this became possible when we agreed on an "official" way to have global (document-wide) files in the METS. You just add the file without a pageId (i.e. without a reference to the file in the structMap) and use a certain convention for the file ID (with FULLDOWNLOAD, IIRC).
Now that different MIME types are allowed in fileGrps, having an output fileGrp with page-wise and document-global reports should be no problem.
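To illustrate what "without a pageId" means at the METS level, here is a rough sketch using only the Python standard library rather than the OCR-D API. The helper name `add_global_file` is hypothetical, as are any fileGrp name and file ID passed to it; the exact ID convention should be checked against the OCR-D specification.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def add_global_file(mets_root, file_grp, file_id, href,
                    mimetype="application/json"):
    """Add a document-wide file to a fileGrp (hypothetical helper).

    No mets:fptr is added to the structMap, so the file is not
    attached to any physical page.
    """
    file_sec = mets_root.find(f"{{{METS_NS}}}fileSec")
    if file_sec is None:
        raise ValueError("METS has no fileSec")
    # Reuse the fileGrp if it exists, otherwise create it.
    grp = next((g for g in file_sec.findall(f"{{{METS_NS}}}fileGrp")
                if g.get("USE") == file_grp), None)
    if grp is None:
        grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp",
                            {"USE": file_grp})
    file_el = ET.SubElement(grp, f"{{{METS_NS}}}file",
                            {"ID": file_id, "MIMETYPE": mimetype})
    ET.SubElement(file_el, f"{{{METS_NS}}}FLocat",
                  {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "FILE",
                   f"{{{XLINK_NS}}}href": href})
    return file_el
```

The page-wise reports would go into the same fileGrp in the usual way, with a structMap reference each; only the global report omits it.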
BTW I believe having a measurement of CER standard deviation or variance is also useful. See here for an implementation.
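For illustration only (this is not the implementation linked above), the spread of the page-wise CER could be computed roughly as follows. Weighting each page by its character count is an assumption of this sketch; an unweighted variance over pages would be an equally valid choice.

```python
import math


def cer_spread(cers, n_characters):
    """Return (mean, variance, standard deviation) of page-wise CERs.

    Each page is weighted by its character count, so the mean equals
    the workspace-wide CER. This is the population variance, without
    a sample-size correction.
    """
    total = sum(n_characters)
    if total == 0:
        raise ValueError("no characters to weight by")
    mean = sum(c * n for c, n in zip(cers, n_characters)) / total
    variance = sum(n * (c - mean) ** 2
                   for c, n in zip(cers, n_characters)) / total
    return mean, variance, math.sqrt(variance)
```

A high standard deviation next to a moderate global CER would point to a few very bad pages rather than uniformly mediocre recognition.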
(Closing the issue was an accident, I often hit the wrong buttons)