Patrice Lopez issues

Results: 77 issues of Patrice Lopez

Revisited result format for aggregated sources

As we are moving to more heterogeneous sources, crossref is one bibliographical record among others. To keep everything well separated and avoid destructive merging, the headache of unified representations and...

enhancement

Experiment with alternative compression

We currently use snappy for records stored in LMDB. Other compression methods might be better suited to small objects, giving a higher compression ratio and faster decompression (in...

enhancement
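A minimal sketch of how such a comparison could be set up, assuming records are small JSON objects. It only uses codecs from the Python standard library (`zlib`, `bz2`, `lzma`) as stand-ins; snappy itself and other candidates would need third-party bindings and are not included. The sample record and its field names are hypothetical.

```python
import bz2
import json
import lzma
import zlib

# Standard-library codecs only; these are stand-ins for the candidates
# one would actually want to compare against snappy.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compression_report(record: bytes) -> dict:
    """Return compressed size / original size per codec (lower is better)."""
    report = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(record)
        # Sanity check: the round trip must give back the original bytes.
        if decompress(packed) != record:
            raise ValueError(f"{name} round trip failed")
        report[name] = len(packed) / len(record)
    return report


# Hypothetical small bibliographic record, serialized as JSON.
sample = json.dumps(
    {
        "DOI": "10.0000/example",
        "title": ["An example title for a small bibliographic record"],
        "author": [{"family": "Doe", "given": "Jane"}],
        "container-title": ["Journal of Examples"],
    }
).encode("utf-8")

for codec, ratio in compression_report(sample).items():
    print(f"{codec}: {ratio:.2f}")
```

On very small inputs a ratio can exceed 1.0 because of per-stream overhead, which is exactly the effect worth measuring on real records. Decompression speed would need a separate timing loop (for example with `timeit`) over a representative sample.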

Experiment with vespa

Matching the full raw reference string provides the best accuracy but is also the most expensive, so scaling this kind of query means adding many Elasticsearch nodes. It...

enhancement

Light response with OA and ISTEX ID

For the glutton web extension, it would be good to have a service that provides both the OA PDF access (as the current service/oa?) and the ISTEX ID when available.

enhancement

Enable caching

Keep in mind that caching queries/results in LMDB makes sense for matching queries only.

enhancement
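As a rough illustration of caching matching queries, here is a sketch that keys the cache on a hash of the canonicalized query. The `MatchCache` class, the `matcher` callable, and the query fields are hypothetical, and a plain dictionary stands in for an LMDB database so that the example stays self-contained.

```python
import hashlib
import json


class MatchCache:
    """Cache results of expensive matching queries (illustrative only)."""

    def __init__(self, matcher):
        self._matcher = matcher  # hypothetical callable: query dict -> result
        self._store = {}         # stand-in for an LMDB database (bytes -> bytes)
        self.hits = 0

    @staticmethod
    def _key(query: dict) -> bytes:
        # Sort keys so that equivalent queries map to the same cache key.
        canonical = json.dumps(query, sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(canonical.encode("utf-8")).digest()

    def match(self, query: dict):
        key = self._key(query)
        cached = self._store.get(key)
        if cached is not None:
            self.hits += 1
            return json.loads(cached)
        result = self._matcher(query)
        self._store[key] = json.dumps(result).encode("utf-8")
        return result


# Usage with a fake matcher that records how often it is really called.
calls = []


def fake_matcher(query):
    calls.append(query)
    return {"DOI": "10.0000/example"}


cache = MatchCache(fake_matcher)
first = cache.match({"biblio": "Doe, Jane. 2009. An example title."})
second = cache.match({"biblio": "Doe, Jane. 2009. An example title."})
```

Queries that are already direct key look-ups would simply bypass this class, since only matching queries are meant to be cached.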

Add an abstract and MeSH classes look-up service

Some abstracts are present in crossref metadata, but of course many are available via MEDLINE data with the nice MeSH classes. The sub-package `pubmed-glutton` parses all MEDLINE data and maps...

enhancement

3em dash support in references

The Chicago reference style has this awful usage of the 3em dash to repeat one, several, or all authors of the previous reference. Although this practice seems to be removed or...
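To make the problem concrete, here is a minimal sketch of expanding a leading 3em dash back into the authors of the previous reference. It only covers the simplest case, where the dash stands for all authors, and it relies on a deliberately naive assumption: the author block is everything before the first ". " in the reference string. The function names are hypothetical and this is not how Grobid handles references.

```python
import re

# A leading run of three em dashes (U+2014) or the single three-em dash
# character (U+2E3B). Typed fallbacks such as hyphen runs are ignored here.
THREE_EM_DASH = re.compile(r"^\s*(?:\u2014{3}|\u2E3B)")


def author_block(reference: str) -> str:
    """Naive assumption: authors are everything before the first '. '."""
    head, sep, _ = reference.partition(". ")
    return head if sep else reference


def expand_three_em_dashes(references):
    """Replace a leading 3em dash with the previous reference's authors."""
    expanded = []
    previous_authors = None
    for ref in references:
        match = THREE_EM_DASH.match(ref)
        if match is None:
            # Regular reference: remember its authors for later entries.
            previous_authors = author_block(ref)
        elif previous_authors is not None:
            # Dash entry: reuse the authors carried over from above.
            ref = previous_authors + ref[match.end():]
        expanded.append(ref)
    return expanded


refs = [
    "Doe, Jane. 2009. A first example title.",
    "\u2014\u2014\u2014. 2010. A second example title.",
    "\u2014\u2014\u2014. 2012. A third example title.",
]
for line in expand_three_em_dashes(refs):
    print(line)
```

The harder cases mentioned above, where the dash replaces only one or several of the authors, would need an actual author segmentation rather than this string heuristic.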

New figure/table segmentation approach and models

Unstable and work in progress! (follow-up of the `fix-vector-graphics` branch) This is a working version for a revision of the cascade process in Grobid, which changes the overall approach for...

Review matching of journal names

The average time spent by FastMatcher (around 15% of the whole runtime) is particularly high for journal names (11.3%). There are apparently too many short, abbreviated, competing journal names...

enhancement

Improve affiliation results by exploiting more Crossref consolidation

Starting from the affiliations extracted by Grobid, it would be interesting to try to validate/correct affiliation strings using the affiliation information possibly present in CrossRef records. In addition, when ROR are available...

enhancement