
Problem coords with "processFulltextAssetDocument"

Open ayhama16 opened this issue 2 years ago • 6 comments

Hi, when I use this command with "processFulltextDocument", everything is fine:

curl -v --output "test.zip" --form input=@./test.pdf --form segmentSentences=1 --form teiCoordinates=ref --form teiCoordinates=s localhost:8070/api/processFulltextDocument

But when I try to convert with "processFulltextAssetDocument", there are no coordinates except on images:

curl -v --output "test.xml" --form input=@./test.pdf --form segmentSentences=1 --form teiCoordinates=ref --form teiCoordinates=s localhost:8070/api/processFulltextAssetDocument

Please tell me if I am doing something wrong, or whether this is a bug. Thank you!
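A quick way to confirm which elements actually carry coordinates is to count the `coords` attributes in the returned TEI. The sketch below assumes the TEI has been saved as a plain XML file (if the service returned an archive, extract the XML first) and that each opening tag sits on a single line. The function name `count_coords` is made up for illustration and is not part of GROBID.

```shell
#!/usr/bin/env bash
# count_coords FILE [ELEMENT]
# Counts opening tags in FILE that carry a coords="..." attribute.
# With ELEMENT (e.g. s, ref, graphic) only tags of that name are counted.
# Assumption: an opening tag does not span several lines.
count_coords() {
  local file="$1" element="${2:-[A-Za-z:]+}"
  # grep -o prints every match on its own line, so wc -l gives the match count.
  # "|| true" keeps the pipeline quiet when grep finds nothing.
  { grep -oE "<${element}( [^>]*)? coords=\"" "$file" || true; } | wc -l | tr -d ' '
}

# Example (not run here):
#   count_coords test.xml s        # sentences with coordinates
#   count_coords test.xml graphic  # images with coordinates
```

If the count for `s` or `ref` is zero while the count for `graphic` is not, the output matches the behaviour described above.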

ayhama16 avatar Feb 24 '22 11:02 ayhama16

I have made changes to these two files and now it works perfectly:

grobid/grobid-service/src/main/java/org/grobid/service/process/GrobidRestProcessFiles.java
grobid/grobid-service/src/main/java/org/grobid/service/GrobidRestService.java

GrobidRestService.java.txt GrobidRestProcessFiles.java.txt

Is there a way to add an option to batch processing that limits "-teiCoordinates" to specific elements? For example, for a batch command such as:

java -Xmx1024m -jar grobid-core/build/libs/grobid-core-0.5.0-onejar.jar -gH grobid-home -dIn /path/to/input/directory -dOut /path/to/output/directory -teiCoordinates -exe processFullText

so that it behaves like this request:

curl -v --output "test.zip" --form input=@./test.pdf --form teiCoordinates=ref --form teiCoordinates=biblStruct --form teiCoordinates=persName --form teiCoordinates=figure --form teiCoordinates=formula --form teiCoordinates=head --form segmentSentences=1 --form teiCoordinates=s localhost:8070/api/processFulltextAssetDocument

Thank you!
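One possible workaround that does not depend on the batch jar is to loop over the input directory and send each PDF to the running service with the same form fields as in the curl request. This is only a sketch: the helper name `process_dir` and the `GROBID_URL` variable are hypothetical, the service is assumed to be reachable on localhost:8070 as in the commands above, and the output is assumed to be one archive per PDF.

```shell
#!/usr/bin/env bash
# Assumption: the service listens at this URL, as in the curl commands in this thread.
GROBID_URL="${GROBID_URL:-http://localhost:8070/api/processFulltextAssetDocument}"

# process_dir IN_DIR OUT_DIR ELEMENT...
# Sends every *.pdf in IN_DIR to the service and writes one archive per PDF
# to OUT_DIR, requesting coordinates only for the listed TEI elements.
process_dir() {
  local in_dir="$1" out_dir="$2" el pdf
  shift 2
  # Turn each remaining argument E into the pair: --form teiCoordinates=E
  for el in "$@"; do
    set -- "$@" --form "teiCoordinates=${el}"
    shift
  done
  mkdir -p "$out_dir"
  for pdf in "$in_dir"/*.pdf; do
    [ -e "$pdf" ] || continue  # the glob matched nothing
    curl -sS --fail \
      --output "${out_dir}/$(basename "$pdf" .pdf).zip" \
      --form "input=@${pdf}" \
      --form segmentSentences=1 \
      "$@" "$GROBID_URL" || echo "failed: ${pdf}" >&2
  done
}

# Example (not run here):
#   process_dir /path/to/input/directory /path/to/output/directory ref s figure
```

Repeating the `teiCoordinates` form field once per element mirrors the curl request; the loop only automates that call for each file.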

ayhama16 avatar Feb 24 '22 13:02 ayhama16

Thanks, ayhama16. I am also looking for a solution to this.

officialsuyogdixit avatar Mar 10 '22 04:03 officialsuyogdixit

@kermitt2 Would it be possible for you to use the fix by @ayhama16 and create a new docker image on docker hub? Thanks!

Jacob-Jan avatar Mar 25 '22 08:03 Jacob-Jan

Hi @Jacob-Jan !

I am working on a new release with many other changes (10 months of changes). The new docker image will be available with the new release and will include these updates to the processFulltextAssetDocument service.

kermitt2 avatar Mar 25 '22 12:03 kermitt2

Hello @kermitt2, Sounds great, thanks! Any indication of when that will be, approximately?

Jacob-Jan avatar Mar 25 '22 12:03 Jacob-Jan

Hi @Jacob-Jan, the new docker images are available.

kermitt2 avatar Apr 20 '22 11:04 kermitt2