iis
Implement a workflow responsible for importing HTML landing pages obtained by the PDF Aggregation System and sending them to the mining algorithm
Originally requested in Redmine: https://support.openaire.eu/issues/9871#note-10
The idea is to implement and integrate a workflow responsible for:
- reading HTML landing pages from the tar.gz packages that the PDF Aggregation System stores in an S3 location
- sending those files, along with their ids (OpenAIRE id and DOI), to an affiliation matching algorithm (a version prepared for this purpose by Myrto)
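
The reading step could be sketched roughly as below. This is only an illustration under several assumptions, not the intended implementation: it uses Python's standard `tarfile` module, it takes the package as an already opened binary stream (fetching it from S3 is left out), and it assumes that HTML entries end in `.html`. The `LandingPage` type, the `read_landing_pages` function and the `ids_by_file` lookup (file name to OpenAIRE id and DOI) are hypothetical names; how the ids are actually associated with the files is not specified here.

```python
import io
import tarfile
from typing import BinaryIO, Dict, Iterator, NamedTuple, Tuple


class LandingPage(NamedTuple):
    """Hypothetical record passed on to the affiliation matching step."""
    openaire_id: str
    doi: str
    html: str


def read_landing_pages(
    package: BinaryIO,
    ids_by_file: Dict[str, Tuple[str, str]],
) -> Iterator[LandingPage]:
    """Yield one LandingPage per HTML entry found in a tar.gz package.

    `ids_by_file` is an assumed lookup from entry name to (OpenAIRE id, DOI).
    Entries without a known id pair are skipped.
    """
    with tarfile.open(fileobj=package, mode="r:gz") as archive:
        for member in archive:
            # Skip directories, links and anything that is not an HTML file.
            if not member.isfile() or not member.name.endswith(".html"):
                continue
            ids = ids_by_file.get(member.name)
            if ids is None:
                continue
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            # Landing pages may be badly encoded, so decode leniently.
            html = extracted.read().decode("utf-8", errors="replace")
            yield LandingPage(ids[0], ids[1], html)
```

Yielding records one at a time keeps memory use bounded by the size of a single page, which may matter if the packages are large.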