Clemens Neudecker

Results: 137 comments by Clemens Neudecker

> would only count 1 error (lazy vs lazer).

Exactly. Any words appearing in the GT and also in the stopword list are ignored when computing the "significant words" accuracy...
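
To make the stopword rule concrete, here is a minimal sketch of how such a count could work. It assumes simple whitespace tokenization and bag-of-words matching; the function name `significant_word_errors` and the sample sentences are hypothetical and do not reflect any particular evaluation tool's implementation.

```python
from collections import Counter


def significant_word_errors(gt_text, ocr_text, stopwords):
    """Count GT words (excluding stopwords) that have no match in the OCR text.

    Assumption: whitespace tokenization, case-insensitive, order ignored.
    Returns (number_of_errors, number_of_significant_gt_words).
    """
    stop = {w.lower() for w in stopwords}
    # Only GT words outside the stopword list are "significant".
    gt_words = [w for w in gt_text.lower().split() if w not in stop]
    ocr_counts = Counter(ocr_text.lower().split())

    errors = 0
    for word in gt_words:
        if ocr_counts[word] > 0:
            ocr_counts[word] -= 1  # consume one matching OCR occurrence
        else:
            errors += 1            # significant GT word missing from OCR
    return errors, len(gt_words)


gt = "the quick brown fox jumps over the lazy dog"
ocr = "teh quick brown fox jumps ovr the lazer dog"
errors, total = significant_word_errors(gt, ocr, {"the", "over"})
accuracy = (total - errors) / total
```

With this input, the misrecognized stopwords ("teh", "ovr") do not count, so only the lazy/lazer mismatch is registered as an error.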

I am fine with what @bertsky proposed in https://github.com/OCR-D/spec/issues/108#issuecomment-503147346 too.

@wrznr So far I mainly applied [k-fold_cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)#k-fold_cross-validation), would you still see added benefits over this by partitioning into three sets?
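
For reference, a minimal sketch of the k-fold partitioning meant here, in plain Python. The helper name `k_fold_splits` is hypothetical, and the sketch omits shuffling, which one would normally apply first. Each sample lands in the held-out fold exactly once, whereas a three-way split fixes one training, one validation and one test set up front.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds.

    Assumption: samples are addressed by index 0..n_samples-1 and are
    already in random order; fold sizes differ by at most one.
    """
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_splits(10, 3))
```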

@wrznr Do your remaining ``@wrznr requested changes`` relate to [this comment](https://github.com/OCR-D/spec/pull/105#pullrequestreview-197476416) only or is there other stuff that needs changing (for the time being)?

This was also a topic in [Europeana Newspapers](http://www.europeana-newspapers.eu/). See e.g. http://www.primaresearch.org/publications/ICDAR2013_Clausner_ReadingOrder and http://www.europeana-newspapers.eu/wp-content/uploads/2015/05/D5.3_Final_release_ENMAP_1.0.pdf

This is only awaiting the updated guidelines, right? #80 is closed and I agree fully with https://github.com/OCR-D/spec/issues/40#issuecomment-421994713. For the main purposes of OCR-D we should avoid (modifying) the depths of...

It seems this went in circles, but regarding the general question of relaxing input/output fileGrp conventions: I would be fine with any form of relaxation that will still allow us to...

Just to let you know that I've been told today that [PMML](http://dmg.org/pmml/v4-3/GeneralStructure.html) is the widely accepted standard to describe ML models. It is XML-based. Perhaps we can learn/borrow some things...

@wrznr @kba @Doreenruirui This has progressed quite far, no? See https://github.com/Doreenruirui/okralact/tree/master/docs and https://github.com/Doreenruirui/okralact/tree/master/engines/schemas.

Ping @splet @chris1010010 for further opinions.