Konstantin Baierer

Results: 302 comments by Konstantin Baierer

The default output viewed in chromium/chrome with the [Deluminate](https://chrome.google.com/webstore/detail/deluminate/iebboopaeangfpceklajfohhbpkkfiaa) extension works very well for me.

Khmer or Arabic? In principle, the process is the same for every language: you need pairs of line images and their transcriptions. AFAIK Khmer is a left-to-right script with distinct characters,...
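As a rough illustration of what "pairs of line images and their transcriptions" can look like on disk, here is a minimal sketch. It assumes a naming convention in which each line image `NAME.png` sits next to a plain-text file `NAME.gt.txt`; the function name `collect_line_pairs` and both suffixes are assumptions made for this example, not something prescribed in the comment above.

```python
from pathlib import Path


def collect_line_pairs(directory, image_suffix=".png", text_suffix=".gt.txt"):
    """Return (image_path, transcription) tuples for line images that have a transcription.

    Assumed (hypothetical) layout: line_0001.png next to line_0001.gt.txt.
    """
    pairs = []
    for image_path in sorted(Path(directory).glob(f"*{image_suffix}")):
        # Swap the image suffix for the transcription suffix to find the partner file.
        stem = image_path.name[: -len(image_suffix)]
        text_path = image_path.with_name(stem + text_suffix)
        if not text_path.is_file():
            continue  # image without a transcription: not usable for training
        text = text_path.read_text(encoding="utf-8").strip()
        if text:  # skip empty transcriptions
            pairs.append((image_path, text))
    return pairs
```

Images without a matching transcription are skipped silently in this sketch; in practice it may be more useful to report them so that missing ground truth can be added.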

> Can you recommend a tool for this purpose?

* ocropus-gtedit
* ketos transcribe (in kraken)
* https://github.com/qurator-spk/neath (server based)
* https://github.com/UB-Mannheim/ocr-gt-tools (server-based ocropus-gtedit)

> Can it be done via...

I've tested it now, unit tests pass and I managed to extract image-text pairs from the kant_aufklaerung_1784 sample in assets:

```sh
$ python3 ./generate_sets.py -d ../assets/data/kant_aufklaerung_1784/data/OCR-D-GT-PAGE/PAGE_0017_PAGE.xml -i ../assets/data/kant_aufklaerung_1784/data/OCR-D-IMG/INPUT_0017.tif
[SUCCESS] created...
```

> @kba Do you know of any Devanagari or any other Indic language datasets in Page XML format? I only have scanned page images and their groundtruth in text...

> I wonder if this causes any irritations? I don't think so but I deleted the branch since it is outdated as you say.

Looks a bit like the output of https://github.com/eddieantonio/ocreval or https://github.com/impactcentre/ocrevalUAtion, but `PosixPath` implies that a Python tool is producing this.

It works for me. Are you sure you recompiled the library and copied it to `pylsd/lib/linux/liblsd.so`? Cf. https://github.com/kba/pylsd/tree/recompile-16, which contains the API change you hinted at and a recompiled liblsd.so:...
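One way to double-check that a recompiled library actually ended up where it is expected is to test whether the file exists and whether the dynamic loader accepts it. The following is only a sketch using the standard `ctypes` module; the helper name `check_shared_library` is hypothetical and is not part of pylsd.

```python
import ctypes
from pathlib import Path


def check_shared_library(path):
    """Return (exists, loadable) for the shared library at `path`."""
    lib_path = Path(path)
    if not lib_path.is_file():
        return (False, False)
    try:
        # CDLL raises OSError if the file is not a valid shared library
        # for this platform or has unresolved dependencies.
        ctypes.CDLL(str(lib_path.resolve()))
    except OSError:
        return (True, False)
    return (True, True)
```

For example, `check_shared_library("pylsd/lib/linux/liblsd.so")` could be run from the repository root after copying the file. A result of `(True, True)` only shows that the library loads, not that it is the freshly compiled build.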

> Maybe we should add such rules, e.g. via Cython.Distutils.build_ext. Care for a PR against your fork/branch, @kba?

Sure, the more complete #12 is, the better. My first idea would...

73648c6 is gone now and pypi-fork is up to date.