Value parser sometimes not reliable with exponents

Open kermitt2 opened this issue 5 years ago • 2 comments

In this example, the raw value looks good but the parsed value is not very exciting.

Screenshot from 2019-11-26 22-32-41

1001._0908.0054.pdf

kermitt2 Nov 26 '19 21:11

Oh, maybe adding some simple post-processing would avoid such mistakes.

lfoppiano Nov 27 '19 13:11

I've implemented some rules in #103:

  • if whatever contained into is numeric, it goes into <number>
  • if <number> comes after <pow> or <exp>, it is probably the exponent (or part of it), so it is concatenated to the previous token
  • if <base> contains e, then the following <pow> should go into <exp>
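
For readers who want to see how rules like these fit together, here is a minimal Python sketch. It is not the code from #103: the `(label, text)` token representation, the `post_process` name, the `NUMERIC` pattern, and the `"other"` label used for the first rule are all assumptions made for illustration.

```python
import re

# Assumption: what counts as "numeric" is a plain signed integer or decimal.
NUMERIC = re.compile(r"^[+-]?\d+(?:[.,]\d+)?$")


def post_process(tokens):
    """Apply the three rules to a list of (label, text) pairs.

    Hypothetical representation: each token is a tuple such as
    ("number", "1.5") or ("pow", "-"). Returns a new list.
    """
    out = []
    base_has_e = False
    for label, text in tokens:
        # Rule 1 (assumed source label "other"): numeric content becomes <number>.
        if label == "other" and NUMERIC.match(text):
            label = "number"

        # Rule 3: if <base> contains "e", the following <pow> becomes <exp>.
        if label == "base":
            base_has_e = "e" in text.lower()
        elif label == "pow" and base_has_e:
            label = "exp"
            base_has_e = False  # only the first following <pow> is relabeled

        # Rule 2: a <number> right after <pow> or <exp> is merged into it.
        if label == "number" and out and out[-1][0] in ("pow", "exp"):
            prev_label, prev_text = out[-1]
            out[-1] = (prev_label, prev_text + text)
            continue

        out.append((label, text))
    return out
```

With this sketch, `[("number", "1.5"), ("base", "e"), ("pow", "-"), ("number", "3")]` comes back as `[("number", "1.5"), ("base", "e"), ("exp", "-3")]`. The rules run in a single pass so that a token relabeled by the first or third rule can still be merged by the second.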

lfoppiano Dec 25 '19 07:12