115 comments by Daniel Ecer

Just to clarify or provide the context, I noticed npm complaining about security issues for other repos (`sciencebeam-texture` and `peerscout`), whereas there is silence from GitHub. That is why I...

Hi, just wondering what the plan is with this PR?

Just wondering whether that is something you'll likely pick up yourself any time soon? I have a number of documents with spacing issues (still the minority but not insignificant when...

Okay, thank you. I've raised https://github.com/kermitt2/pdfalto/issues/111. Let me know if you'd prefer to close this issue or keep it to track the issue in GROBID (e.g. to make sure it...

I noticed the same or something similar happens with the first PMC manuscript, for example. However, the extracted XML looks a bit different depending on the parameters. The PMC evaluation...

Thank you for that. I think in that case the text was missing altogether and didn't appear as a figure or table description either. I guess it would be interesting...

This seems to be due to [FullTextParser](https://github.com/kermitt2/grobid/blob/0.6.2/grobid-core/src/main/java/org/grobid/core/engines/FullTextParser.java#L238-L302) processing figures and tables from the body only.

One example is [DOI 10.1101/306803](https://doi.org/10.1101/306803) or [306803v1](https://www.biorxiv.org/content/10.1101/306803v1) (from the bioRxiv 10k validation dataset). It has "Extended Data Figure 1" etc. I haven't tested whether they are going to get extracted...

Just some feedback from my side. I am using [sciencebeam-trainer-delft](https://github.com/elifesciences/sciencebeam-trainer-delft). I am using regular `pip` to install dependencies, including `jep`. The version of `jep` needs to match the one used...

Thank you for getting back quickly. I currently don't have any Android dev tools set up, but could try to do that if that would be helpful. As mentioned, the issue is...