Patrice Lopez
Hello @rajeshkumargp ! Thanks for reporting the problem. Could you add the PDF (or send it to me by email if it is not public) so that we can reproduce...
Reading order issue: the title comes at the end of the page in the PDF stream, and for some obscure reason it vanishes into limbo.
Hi @frankrod It's an interesting observation. I think you're correct, currently only the title level `a` and title level `j` (journal title) are used for consolidation. When a reference with...
In this text: >"For the moment, we are also not relying on transformer approaches incorporating layout information, like LayoutML (Xu et al., 2020), LayoutLMv2 (Xu et al., 2021), SelfDoc or...
Hi @olszewskip ! Thanks for the question. If you send me your email (my email is in the readme) I can invite you to a GROBID mattermost channel. Your scenario...
Thanks @lfoppiano ! xom was nice for the toXml() methods. Given #184, we would actually no longer even need the onejar-style build.
Hello @jumenzel ! This was a subject of discussion at some point somewhere else. I think there is no issue because basically GROBID calls the pdfalto binary as an external command...
Thank you very much @freethejazz for the update ! It's very appreciated that you keep an eye on the new release from CrossRef, because I am not very good for...
Thank you @keto33 ! In pdfalto, you can choose to extract and process images (embedded bitmap and vector graphics) or not with the argument `-noImage`. Grobid can extract or not...
I am translating into English so that everybody can benefit from the issue. Ph.D. theses and "habilitation" documents are not supported by the current header model, apart from simple header metadata like title,...