filak

20 comments by filak

@zuphilip I am fine with the following: The LeftMargin|RightMargin will need some testing - are there any **real** example files available?

There is a new DEV version of alto__hocr.xsl which supports TopMargin & BottomMargin: https://github.com/filak/hOCR-to-ALTO/blob/master/dev/alto__hocr.xsl If it works fine I will update the production file.

@kba I do not see any content in the margin elements, so the transformation will produce no output for them. I think the Top and Bottom margins have been...

There is a new DEV version of hocr__alto4.xsl for testing which supports TopMargin & BottomMargin: https://github.com/filak/hOCR-to-ALTO/blob/master/dev/hocr__alto4.xsl If it works fine I will update all the production files.

I updated master a while ago and just forgot to let you know... https://github.com/filak/hOCR-to-ALTO/commit/61bb10e6f36a6b9c65776013e2dd22a52db3575c

Thanks to @wush978 I was able to locate the missing jars. Just download them and copy them to the ...\R-3.6.2\library\mailR\java\ folder. Important: restart R after copying the jars. The...

This demo works in Firefox 88.0.1: https://www.jqueryscript.net/demo/jQuery-Drag-drop-Sorting-Plugin-For-Bootstrap-html5sortable/

Sure, there is a workaround: pre-populate the data with sort_key() values. But this might not always be optimal or feasible.

```
from operator import itemgetter
from pyuca import Collator
coll =...
```
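For readers unfamiliar with the idea, here is a minimal sketch of what pre-populating sort keys could look like. It is not the code from the truncated snippet. The helper `prepopulate_sort_keys`, the `sort_key` field name, and the sample rows are all hypothetical. To keep the example runnable without third-party packages, it uses a simple stdlib stand-in key (accent stripping plus casefolding) instead of pyuca; with pyuca installed, `Collator().sort_key` could be passed as `key_func` instead.

```python
from operator import itemgetter
import unicodedata


def fallback_sort_key(text):
    """Stand-in key (an assumption, not the UCA): strip accents, then casefold.

    The original text is kept as a tie-breaker so the ordering is deterministic.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (stripped.casefold(), text)


def prepopulate_sort_keys(rows, field, key_func, key_field="sort_key"):
    """Return copies of the rows with a precomputed sort key stored in key_field."""
    return [{**row, key_field: key_func(row[field])} for row in rows]


# Hypothetical sample data.
rows = [{"name": "Zebra"}, {"name": "écoute"}, {"name": "apple"}, {"name": "Éclair"}]

# Compute the keys once, up front...
keyed = prepopulate_sort_keys(rows, "name", fallback_sort_key)

# ...then every later sort only compares the stored keys.
ordered = sorted(keyed, key=itemgetter("sort_key"))
print([row["name"] for row in ordered])
```

The trade-off is the one the comment hints at: the keys cost extra storage and must be recomputed whenever the underlying values change, which is why this may not always be feasible.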

My point is that Tesseract outputs language info into hOCR, but in ALTO there is none. There is some conditional logic - if there is paragraph_lang => no lang output...

Take a look at [this](https://nbviewer.ipython.org/urls/gist.githubusercontent.com/inodb/c030d765460b0ed9e616/raw/4efe8ddf44ac22c9005bcacaebf0e1d28b3272a6/pygal_demo) example