Patrice Lopez

77 issues by Patrice Lopez

Nice work, thanks! Using GloVe embeddings as indicated and increasing the number of epochs to 70 without touching anything else, I obtained an F-score of 89.16 averaged over 10...

An error case for accent composition in pdfalto; see https://github.com/kermitt2/grobid/issues/906 for the PDF.

bug

By default, pdfalto extracts both embedded bitmaps and vector graphics. The option -noImage disables extraction of both graphics types. However, we might still want the vector graphics extracted and not the...

enhancement

We currently use simple formatting patterns like `%1.4f` to serialize the coordinates in the XML and SVG files (avoiding `e` formatting, which can introduce an exponent). The drawback is that...

enhancement
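
As a rough illustration of the formatting behaviour described in this issue, the Python sketch below applies the same `%1.4f` pattern and compares it with `%e` formatting. The helper name `format_coordinate` and the sample values are made up for this example; it is not code from pdfalto.

```python
def format_coordinate(value: float) -> str:
    """Serialize a coordinate with the fixed-point pattern %1.4f.

    Hypothetical helper for illustration only, not pdfalto code.
    """
    return "%1.4f" % value


if __name__ == "__main__":
    # Sample values chosen for illustration: ordinary, very small, very large.
    for v in (12.5, 0.00001234, 123456789.0):
        # Fixed-point output never contains an exponent; %e always does.
        print(format_coordinate(v), "vs", "%e" % v)
```

The fixed-point pattern always writes exactly four decimals and never an exponent, whatever the magnitude of the value.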

For some reason, the rotation attribute, which was present in pdf2xml and is still computed, is not currently output in the ALTO file. If I remember well, we though...

enhancement

> It would be great if you would consider an option to include the Glyph/Character level in the output. For the moment only token-level output is implemented.

enhancement

The quantity CRF model recognizes numerical expressions with powers of 10 (in particular, distorted ones due to PDF text extraction): ![example_exponent](https://cloud.githubusercontent.com/assets/2340795/13862621/c63b7d90-ec94-11e5-8c5b-9f70a2385489.png) However, we are not currently parsing them (in their...

bug
implemented

In this example, the raw value looks good but the parsed value is not very exciting. ![Screenshot from 2019-11-26 22-32-41](https://user-images.githubusercontent.com/2340795/69675821-d786c680-109f-11ea-8d40-48271692d192.png) [1001._0908.0054.pdf](https://github.com/kermitt2/grobid-quantities/files/3894162/1001._0908.0054.pdf)

bug
implemented

Values can be entirely numerical, use powers of 10 (see #7) or an exponent symbol (0.2E-4), number words ("twenty") (see #8), or date/time expressions ("October 19, 2014 at 20:09 TDB") (see #12)....

enhancement
task
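
To make the value forms listed in this issue concrete, here is a minimal Python sketch that normalizes a few of them to a float. It is not the grobid-quantities implementation: the function `parse_value`, the tiny `NUMBER_WORDS` table, and the assumed `a x 10^b` surface form are all hypothetical, and date/time expressions are deliberately left unparsed.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny lookup table for number words.
NUMBER_WORDS = {"one": 1.0, "two": 2.0, "ten": 10.0, "twenty": 20.0}

# Assumed surface form for powers of 10, e.g. "3 x 10^5" or "1.5 × 10^-3".
POWER_OF_TEN = re.compile(
    r"^([+-]?\d+(?:\.\d+)?)\s*[x×]\s*10\^([+-]?\d+)$"
)


def parse_value(raw: str) -> Optional[float]:
    """Return a float for the supported forms, or None if unparsed."""
    text = raw.strip()

    # Mantissa times a power of 10 (assumed notation, see above).
    match = POWER_OF_TEN.match(text)
    if match:
        return float(match.group(1)) * 10.0 ** int(match.group(2))

    # Number words, via the small illustrative table.
    word = NUMBER_WORDS.get(text.lower())
    if word is not None:
        return word

    # Plain numbers and exponent-symbol notation such as "0.2E-4".
    try:
        return float(text)
    except ValueError:
        # Dates, times and anything else are out of scope for this sketch.
        return None
```

For example, `parse_value("0.2E-4")` and `parse_value("twenty")` both return floats, while a date/time string falls through to `None`.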

e.g. **silicon nitride powder** for the measurement **10kg** in: ``` A mixture of 10kg of _silicon nitride powder_ was charged into the mixing chamber 20 of the mixing vessel 18....

enhancement