Stefan Weil
modern.ie is really nice. I did not know it before, but have now used it to generate a screenshot of [the Louvre image](http://merovingio.c2rmf.cnrs.fr/iipimage/iipmooviewer/louvre.html) with IE 11. [Here](https://www.browserstack.com/screenshots/aae6d2c0eb232579cdc22ef5c240a4c3b613874f/win10_ie_11.0.jpg) it is. It is...
[This page](https://www.browserstack.com/screenshots/77a3ca7dde26207ffe618deaa4fb6ca547194be8/win10_ie_11.0.jpg) shows the Orion nebula with IE 11 on Windows 10, produced by the snapshot function of modern.ie. Here the centered image is visible, but too small. The navigation...
See https://github.com/UB-Mannheim/ocr-fileformat/releases/tag/v0.3.0.
> Is there anything else we definitely want for a version 1.0.0 afterwards? See [milestone v1.0.0](https://github.com/UB-Mannheim/ocr-fileformat/milestone/3). Maybe other issues should be added there, too.
As soon as PR #127 is merged, only support for GCV (issue #125) is missing for milestone v1.0.0.
A new release v0.4.0 is now available.
It would be good to have a license statement from @glenrobson if we want to use that XSLT.
[Here](https://digi.bib.uni-mannheim.de/fileadmin/digi/511144768/alto/511144768_0008.xml) is an ALTO file generated with Tesseract (see https://github.com/tesseract-ocr/tesseract/pull/2067). [Another page](https://digi.bib.uni-mannheim.de/fileadmin/digi/511144768/alto/511144768_0009.xml) was processed by ABBYY FineReader. While ABBYY adds the `<SP>` tags, Tesseract (and ocr-fileformat) does not. As the...
The [ALTO documentation](https://www.loc.gov/standards/alto/techcenter/layout.html) says "A TextBlock is divided into lines and those are divided into strings, spaces and hyphens". I don't interpret that as a strict requirement that spaces are...
Clemens has created an issue for that: https://github.com/altoxml/schema/issues/54 (thank you).
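To see which text lines in a given ALTO file have no `<SP>` element between adjacent `<String>` elements, something like the following could be used. It is only a minimal sketch: the sample XML is made up for illustration, and the helper names `local_name` and `lines_missing_sp` are hypothetical, not part of ocr-fileformat or Tesseract. It ignores the namespace so that it works across ALTO versions.

```python
import xml.etree.ElementTree as ET

# Made-up sample: the first line separates its strings with <SP/>,
# the second line does not.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Hello"/><SP/><String CONTENT="world"/>
    </TextLine>
    <TextLine>
      <String CONTENT="no"/><String CONTENT="spaces"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def lines_missing_sp(alto_xml):
    """Return the 0-based indices of TextLine elements in which two
    String elements follow each other directly, without SP (or HYP)
    in between."""
    root = ET.fromstring(alto_xml)
    missing = []
    index = 0
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        kinds = [local_name(child.tag) for child in elem]
        if any(a == "String" and b == "String"
               for a, b in zip(kinds, kinds[1:])):
            missing.append(index)
        index += 1
    return missing


print(lines_missing_sp(SAMPLE_ALTO))
```

Whether a line reported here is actually a problem depends on how the ALTO documentation is read; the check only shows where the two tools differ.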