Patrice Lopez
Hi @Sukii, hmm, I am not aware of such an existing tool. I guess the context of the question is not OCR (superposing ALTO information on its corresponding image PDF),...
Thank you! This is fixed with #112.
Hi @amensiko! Thanks a lot for the nice words on Grobid and the PR! If I understand correctly, the `download` option you introduce is actually a "write" option....
Thank you! This is fixed with #112.
Thank you for the error case and #108. The error was introduced with the processing of line numbers; this is a priority for my next iteration on pdfalto.
Hi @WolfgangFahl, thanks a lot for the issue! I have opened PR #19 based on your proposal to catch the error when printing a file name with invalid unicode bytes....
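For illustration, here is a minimal sketch of the kind of guard discussed above, not the actual code of PR #19. The helper names `printable_name` and `report` are hypothetical. It assumes the file name was decoded with `surrogateescape`, so invalid bytes show up as lone surrogates that cannot be encoded as UTF-8 when printed.

```python
def printable_name(path: str) -> str:
    """Return a copy of `path` that can always be encoded as UTF-8.

    Hypothetical helper: characters that cannot be encoded (for example
    lone surrogates from invalid file name bytes) are replaced by '?'.
    """
    return path.encode("utf-8", errors="replace").decode("utf-8")


def report(path: str) -> str:
    """Print a progress message for `path` without raising on bad names."""
    message = f"processing {path}"
    try:
        # Same strict check a UTF-8 output stream performs when printing.
        message.encode("utf-8")
    except UnicodeEncodeError:
        # Fall back to a sanitized name instead of crashing the whole batch.
        message = f"processing {printable_name(path)}"
    print(message)
    return message


if __name__ == "__main__":
    report("paper.pdf")
    report("bad\udcffname.pdf")  # lone surrogate standing in for an invalid byte
```

Catching the error per file keeps one badly named document from interrupting a long parallel run.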
Thanks for the feedback, @WolfgangFahl! The idea with this client was to provide an example of usage of the GROBID REST API for parallel processing, easy to adapt to...
Thank you! Normally the corresponding `delete` at the end of `TextPage::createPath` should not be present, as it could lead to a crash when `xmlFreeDoc()` is applied later. Fixed with 2d1bafa25091c1f7ea34d89ea8510273b36af455.
Hello @bmorton1 ! Good question. Currently in pdfalto we use RGB format for color, so no alpha channel for transparency (like ARGB), following the ALTO specifications: ```xml Font color as...
See #115; work is in progress to have something well packaged.