Marek Horst

Results 82 issues of Marek Horst

This is a follow-up to https://github.com/openaire/iis/issues/1327. One way to refine the accuracy of the `CommonAffSectionWordsVoter` voter was to check for country code compatibility between the affiliation and the organization. This could be understood...

activity: impl
functionality: affiliations

Originally requested on Redmine: https://support.openaire.eu/issues/7685 We should start with a dedicated branch where we could place the Ariadne mining code and run it in isolation, without integrating it with the IIS main workflow...

Originally reported on Redmine: https://support.openaire.eu/issues/7666 Apparently the report file location is being used as the metric HELP text:

```
$ http localhost:9091/metrics | grep processing_citationTextExtraction_docs
# HELP processing_citationTextExtraction_docs location:/user/dnet.beta/iis/working_dirs/primary/report
# TYPE processing_citationTextExtraction_docs gauge
processing_citationTextExtraction_docs{instance="",job="iis",user="dnet.beta"} 1.0487217e+07
processing_citationTextExtraction_docs{instance="",job="iis",user="dnet.production"}...
```

functionality: execution reports

Another follow-up to https://github.com/openaire/iis/issues/1327. During debugging it turned out to be extremely difficult to pinpoint: * the exact affiliation (number) of the document which allowed to produce...

functionality: affiliations

This task is about incorporating a Python 3-compatible version of madis.

Once we are past the ongoing experimental phase, we should introduce a mechanism for blacklisting plaintexts coming from Springer, to avoid processing them and generating any mining results involving blacklisted...

This is a direct follow-up to tasks #1292 and #1293. According to the documentation available in our GitLab repository (hosting the deployment configuration): https://git.icm.edu.pl/openaire/iis-deployment/-/wikis/Deployment-by-Jenkins https://git.icm.edu.pl/openaire/iis-deployment/-/wikis/Integration-tests-by-Jenkins there are specific user...

deployment

We should focus on the set of fields exported by IIS. I cannot share the spreadsheet link here because it currently appears to be editable by anyone.

activity: documentation

This task is related to the integration of "text mining algorithms to extract OpenAIRE data that link chemical ingredients of food and cosmetics to allergies, irritation, cancer, and toxicity". Once...

Currently we are able to match over 14 million citations out of 101 million bibliographic references on the beta infrastructure using the direct citation matching algorithm (matching based on external identifiers such...

activity: impl
functionality: citation-matching