
Switch OAI-PMH library in metafacture-biblio

Open fsteeg opened this issue 3 years ago • 0 comments

In metafacture-biblio, we depend on org.dspace:oclc-harvester2:0.1.12 (see details).

It's the only version of the OCLC harvester published to Central (see https://mvnrepository.com/artifact/org.dspace/oclc-harvester2). There is a GitHub repo at https://github.com/OCLC-Research/oaiharvester2 which contains a slightly newer version, but that version is not published to Central.

We came across an issue in the library while using it from OERSI, caused by a call in HarvesterVerb, resulting in duplicate logging output (see workaround). With our current setup, we have no way to properly fix issues like this. We should either depend on the OCLC harvester in a way that allows us to make changes to the code, or switch to a new library.

The OCLC harvester is used in a lot of projects on GitHub, many of which incorporate the code into their repos. The newest maintained version of the original OCLC code seems to be in the oai-harvest-manager repo: https://github.com/clarin-eric/oai-harvest-manager/tree/master/src/main/java/ORG/oclc/oai/harvester2/verb. That repo, however, is not published to Central either.

One option would be to set up a fork of the original OCLC repo and publish it to Central via GitHub Actions. This alone would let us make changes to the code. We could also ask the oai-harvest-manager folks to contribute their version to that fork.

Another option would be to switch to a different library, like XOAI, which is published to Central.

Discussed with @dr0i: as a first step, we should have a look at XOAI to see if that works for us.
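
For orientation while comparing libraries: at the protocol level, the harvesting we need comes down to issuing a ListRecords request and then following resumption tokens until the repository stops returning one. The sketch below only shows how those two request URLs are built according to the OAI-PMH specification. It is not taken from the OCLC harvester or from XOAI; the class name OaiPmhRequest, its method names, and the example.org base URL used below are hypothetical and only serve as an illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of metafacture-biblio, the OCLC harvester or XOAI.
public final class OaiPmhRequest {

    private OaiPmhRequest() {
    }

    // Builds the initial ListRecords request. metadataPrefix is required by
    // OAI-PMH for this request; set is optional and skipped if null or empty.
    public static String listRecords(String baseUrl, String metadataPrefix, String set) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("?verb=ListRecords")
                .append("&metadataPrefix=").append(encode(metadataPrefix));
        if (set != null && !set.isEmpty()) {
            url.append("&set=").append(encode(set));
        }
        return url.toString();
    }

    // Builds a follow-up request. In OAI-PMH, resumptionToken is an exclusive
    // argument: apart from the verb, no other argument may be sent with it.
    public static String resume(String baseUrl, String resumptionToken) {
        return baseUrl + "?verb=ListRecords&resumptionToken=" + encode(resumptionToken);
    }

    // Form-encodes a query parameter value as UTF-8 (works on Java 8).
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, OaiPmhRequest.listRecords("https://example.org/oai", "oai_dc", null) builds the first request, and OaiPmhRequest.resume(...) is then called with each token found in the response. Whatever library we end up with should cover this loop plus the XML parsing and error handling around it.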

fsteeg, Feb 23 '21 08:02