Patrice Lopez

Results: 601 comments by Patrice Lopez

Current state of thoughts about a JNI for pdf2xml: pros: - doable: a JNI for pdf2xml looks doable using C++ memory-mapped files - PDF files can be loaded by...

Hi, I'm also interested in importing Grobid as a Java dependency in Scala code for Spark. Is this feature on the roadmap, @kermitt2? Re-hello @xegulon :) Is it different...

@bnewbold Thanks a lot for the improvement! I think there are small issues in the new regex, and I propose some changes in the review after some tests, if...

Ok sorry I think I did not submit the review, can you see my comments now?

Hi @JOscarJ! Thanks for the issue. I confess that I never test the JAXB binding after updating the XML schemas, because I am not using it (and I...

Hi @elonzh! Crossref for consolidation is just provided for convenience, because it does not require an installation and is very easy to set up for casual use. See https://github.com/kermitt2/grobid/issues/616#issuecomment-669845787 I...

Hi @karatekaneen! Actually, the limitation is also the number of examples to be used during the training. The total number of training examples is partitioned according to the number...

Hi @lucbouge! I think you're talking about the generated training data files? Normally these files must be reviewed manually, so the XML errors are hurting so much... just a...

The two formats can be confusing! There is normally no `` tag output in the final result and by the Python `grobid_client` (this tag does not exist in the class...

Hi @Show-han! There's an open issue for this: https://github.com/kermitt2/grobid/issues/844 We can try to push the implementation forward a bit next week for release 0.7.2.