grobid_client_python

I can find nothing in output directory!

Open Mayar2009 opened this issue 4 years ago • 3 comments

Hello! I am sorry if I am missing something obvious, but I did everything as written. I have Windows 10 and I am working from PowerShell, because with cmd I could not carry out the instruction steps:

  1. PS C:> git clone https://github.com/kermitt2/grobid-client-python
  2. PS C:> cd grobid-client-python
  3. PS C:\grobid-client-python> python3 grobid-client.py --input ~C:\gropid\output --output ~C:\gropid\output processFulltex
  4. Then I go to the output directory and find nothing! I have just two PDFs in the input directory.

Am I missing something?

Mayar2009 • Jan 28 '20 09:01

Hello @Mayar2009 !

First you need a Grobid server running - ok I admit, this can be considered as obvious from my point of view ;)

You can check https://github.com/kermitt2/grobid and the documentation https://grobid.readthedocs.io/en/latest/Install-Grobid/ to get a server running

or

you can just use the public demo Grobid instance, if you don't have too many PDFs to process and you don't care about the cool sub-second parallel PDF processing ... For this, you can just modify the config.json file of the present client as follows:

{
    "grobid_server": "cloud.science-miner.com/grobid",
    "grobid_port": "",
    "batch_size": 1000,
    "sleep_time": 5,
    "coordinates": [ "persName", "figure", "ref", "biblStruct", "formula" ]
}

Then, if you have a decent internet connection and don't care about network latency, you will save time by skipping the server install.
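To make the role of the two connection settings concrete, here is a minimal sketch of how a client could turn `grobid_server` and `grobid_port` into an endpoint URL. The helper name `build_service_url` is hypothetical and this is not necessarily how grobid-client.py assembles its URLs. The `localhost` / `8070` values are assumptions for a locally running server, and the `/api/<service>` path layout is assumed from the service name used on the command line.

```python
import json


def build_service_url(config: dict, service: str) -> str:
    """Assemble an endpoint URL from config.json-style settings (illustrative only)."""
    server = config["grobid_server"].strip()
    port = str(config.get("grobid_port", "")).strip()

    # Add a scheme only if the configured server does not already carry one.
    if not server.startswith(("http://", "https://")):
        server = "http://" + server

    # Split off the host so the port lands before any path such as /grobid.
    scheme, rest = server.split("://", 1)
    host, sep, path = rest.partition("/")
    if port:
        host = f"{host}:{port}"

    base = f"{scheme}://{host}{sep}{path}".rstrip("/")
    return f"{base}/api/{service}"


# Settings taken from the config shown above (public demo instance, empty port).
demo = json.loads('{"grobid_server": "cloud.science-miner.com/grobid", "grobid_port": ""}')

# Assumed settings for a server running on the same machine.
local = {"grobid_server": "localhost", "grobid_port": "8070"}

print(build_service_url(demo, "processFulltextDocument"))
print(build_service_url(local, "processFulltextDocument"))
```

With an empty `grobid_port` the URL simply omits the port, which is why the demo configuration leaves that field blank.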

Thanks for the interest in Grobid !

kermitt2 • Jan 28 '20 19:01

Thanks for your response! Actually, I do have a lot of PDFs, so I am interested in using it through the Python client. I did everything you mentioned, but nothing worked for me. I have been reading back and forth between the two Grobid repositories for three days.

What is the required version of Java? I have javac 1.8.0_241, which I got after installing JDK 8 as the documentation mentions. Please tell me the right way to build Grobid and use the Python client, so that I can get the result of this command:

python3 grobid-client.py --input ~/tmp/in2 --output ~/tmp/out processFulltextDocument

I have read the issues and tried many solutions. It would take too long to write everything here, so I decided it would be better to lay out the sequence of steps. I have Windows 10 and I use PowerShell, because cmd did not work with the instructions (I do not know why). My Java version is javac 1.8.0_241. In short, following the documentation, I got:

PS C:>git clone https://github.com/kermitt2/grobid.git (ok)
PS C:> cd C:\grobid
PS C:\grobid:> ./gradlew clean install test

I got the error:

org.grobid.trainer.evaluation.EvaluationUtilitiesTest > testTokenLevelStats4_realCase FAILED
    java.lang.AssertionError at EvaluationUtilitiesTest.java:224

org.grobid.trainer.evaluation.ModelStatsTest > testInstantiation_realCase FAILED
    java.lang.AssertionError at ModelStatsTest.java:37

40 tests completed, 2 failed, 1 skipped

Task :grobid-trainer:test FAILED

FAILURE: Build failed with an exception.

What went wrong:
Execution failed for task ':grobid-trainer:test'.
There were failing tests. See the report at: file:///C:/grobid/grobid-trainer/build/reports/tests/test/index.html

Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

Get more help at https://help.gradle.org

Deprecated Gradle features were used in this build, making it incompatible with Gradle 6.0. Use '--warning-mode all' to show the individual deprecation warnings. See https://docs.gradle.org/5.4.1/userguide/command_line_interface.html#sec:command_line_warnings

BUILD FAILED in 5m 48s
38 actionable tasks: 38 executed

Then I installed another JVM version (the latest one) and checked which Gradle version I have:

./gradlew --version

Gradle 5.4.1
Build time: 2019-04-26 08:14:42 UTC
Revision: 261d171646b36a6a28d5a19a69676cd098a4c19d

Kotlin: 1.3.21
Groovy: 2.5.4
Ant: Apache Ant(TM) version 1.9.13 compiled on July 10 2018
JVM: 13.0.2 (Oracle Corporation 13.0.2+8)
OS: Windows 10 10.0 amd64

I got the same errors again. As far as I understand, the problem seems to be related to Gradle, but I do not understand how to solve it.

Mayar2009 • Jan 28 '20 21:01

I've just seen this now. Two points:

  • Windows is not really supported as a platform for Grobid. If you can't get it to run, I suggest you opt for the Docker image (https://grobid.readthedocs.io/en/latest/Grobid-docker/).

  • If the error is a failing test, this can happen on different platforms, especially on Windows... You can skip the tests with: ./gradlew clean install -x test
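Whichever route is taken (Docker image or a local build with the tests skipped), it can help to confirm that the server actually answers before running the client, since an unreachable server is one way to end up with an empty output directory. The following is a minimal sketch rather than part of the client. The function name `is_grobid_alive` is hypothetical, and it assumes the server exposes the `/api/isalive` health endpoint of the GROBID REST service.

```python
import http.client
import urllib.error
import urllib.request


def is_grobid_alive(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on <base_url>/api/isalive answers with HTTP 200."""
    url = base_url.rstrip("/") + "/api/isalive"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a malformed URL:
        # in all of these cases the server is treated as not reachable.
        return False
```

For example, `is_grobid_alive("http://localhost:8070")` would be the call for a server on the same machine, assuming it listens on port 8070. If it returns False, the client has nothing to send the PDFs to.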

lfoppiano • Aug 19 '20 01:08