Add CrossRef DOI provider
What this PR does / why we need it: This PR adds a new DOI provider, CrossRef. The implementation is modeled largely on the existing providers, such as DataCite and PermaLink.
Which issue(s) this PR closes:
- Closes #8581
Special notes for your reviewer:
Suggestions on how to test this:
To test this, you need a test CrossRef account and must set all of the required properties described in doc/sphinx-guides/source/installation/config.rst.
Does this PR introduce a user interface change? If mockups are available, please link/include them here: N/A
Is there a release notes update needed for this change?: N/A
Additional documentation: N/A
coverage: 20.083% (-0.06%) from 20.14% when pulling 6df751346325948935582836a69b6e5378fb1b06 on konradperlowski:8581-crossref-provider into 97508e61b9d1ba99c63412248a27f7bbbf99b575 on IQSS:develop.
Thanks for the PR! FYI: #10234 allows multiple PID providers to be run at once. To do that, it refactors from providers being beans to being dynamically loaded, one instance per type/account. It doesn't affect the internal logic of providers much, but there will be merge issues between this PR and that one. (It should also allow new provider types to be distributed as separate jars if that's of interest.)
"I would like to inform you about next two Dataverses repos from RODBUK family. First one (uken.rodbuk.pl) runs on a Crossref as DOI provider." -- https://github.com/IQSS/dataverse-installations/issues/227
Sounds like this code is being used in production! 😄
@konradperlowski - With the changes to allow multiple PID providers, this is going to need some refactoring. It should also be possible to package this as a separate jar if that's desirable, but at a minimum the provider needs to be adapted so that it is no longer a Bean and instead uses the factory pattern and the new per-provider config options. The DataCite provider is probably still a good example to follow. If you have any specific questions, let me know - I'd be happy to walk through the multipid changes. Hopefully it isn't much work, as the changes are primarily related to how things get initialized and there shouldn't be any real changes to the CrossRef-specific logic.
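For anyone following along, here is a minimal sketch of the general shape described above: a factory per provider type that creates one provider instance per configured id/account, instead of a single injected Bean. All names here (`SketchPidProvider`, `SketchPidProviderFactory`, the `authority` and `shoulder` config keys, the `10.5072`/`10.5073` prefixes) are hypothetical placeholders for illustration, not the actual Dataverse interfaces or settings; the DataCite provider in the codebase remains the real reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class PidFactorySketch {

    /** Hypothetical stand-in for a PID provider; NOT the real Dataverse interface. */
    interface SketchPidProvider {
        String getId();
        String mint(String identifier);
    }

    /** Hypothetical factory: one per provider type, one provider per configured id. */
    interface SketchPidProviderFactory {
        String getType();
        SketchPidProvider create(String id, Map<String, String> config);
    }

    /** Plain object (no bean annotations); all state comes from per-provider config. */
    static final class CrossRefSketchProvider implements SketchPidProvider {
        private final String id;
        private final String authority;
        private final String shoulder;

        CrossRefSketchProvider(String id, String authority, String shoulder) {
            this.id = Objects.requireNonNull(id, "id");
            this.authority = Objects.requireNonNull(authority, "authority");
            this.shoulder = shoulder;
        }

        @Override
        public String getId() {
            return id;
        }

        @Override
        public String mint(String identifier) {
            // Only builds the identifier string; a real provider would also
            // register metadata with the remote service.
            return "doi:" + authority + "/" + shoulder + identifier;
        }
    }

    static final class CrossRefSketchFactory implements SketchPidProviderFactory {
        @Override
        public String getType() {
            return "crossref";
        }

        @Override
        public SketchPidProvider create(String id, Map<String, String> config) {
            // Config keys are assumptions made up for this sketch.
            return new CrossRefSketchProvider(
                    id, config.get("authority"), config.getOrDefault("shoulder", ""));
        }
    }

    public static void main(String[] args) {
        SketchPidProviderFactory factory = new CrossRefSketchFactory();

        // Two independent instances of the same type, e.g. two accounts.
        Map<String, SketchPidProvider> providers = new HashMap<>();
        providers.put("crossref1",
                factory.create("crossref1", Map.of("authority", "10.5072", "shoulder", "FK2/")));
        providers.put("crossref2",
                factory.create("crossref2", Map.of("authority", "10.5073")));

        System.out.println(providers.get("crossref1").mint("ABC123")); // doi:10.5072/FK2/ABC123
        System.out.println(providers.get("crossref2").mint("XYZ"));    // doi:10.5073/XYZ
    }
}
```

The point of the sketch is only that initialization moves into the factory and the config map, while the provider's own logic (here, `mint`) stays as it was.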
@konradperlowski Have you had a chance to make the changes recommended by @pdurbin ? We are ready to review this.
> Thanks for the PR! FYI: #10234 allows multiple PID providers to be run at once. To do that, it refactors from providers being beans to being dynamically loaded, one instance per type/account. It doesn't affect the internal logic of providers much, but there will be merge issues between this PR and that one. (It should also allow new provider types to be distributed as separate jars if that's of interest.)
It looks to me like #10234 was merged in March, but the [email protected] package was released in 2023. Can I just build a jar based on the current dataverse-spi package? Do you have a sample project for building a jar?
@pdurbin https://github.com/IQSS/dataverse/pull/10235#issuecomment-2275083381
You can't just compile against the spi at this point for a couple reasons:
- no one has added the PidProviderFactory and PidProvider interfaces to it
- you'll probably end up using internal classes from Dataverse to generate the relevant metadata (and these also aren't in the spi)
I expect that further code changes could allow making a completely separate project to create a new PID Provider, but as it is now, you should be able to use the new interfaces, compile against Dataverse itself, and, if desired, add a module to create a separate jar at the end. (I basically did that to create a test exporter before we had put the relevant exporter interfaces in the spi. FWIW: It's even possible to create a jar by hand for testing once your code works with the new interfaces.)
> compile against Dataverse itself
Do you mean compile against the source code, or is there a Dataverse jar in a Maven repo?
I think you have to compile against the Dataverse source at this point. There is a parent pom - see https://guides.dataverse.org/en/latest/developers/making-library-releases.html and the io.gdcc releases - but that doesn't cover the Dataverse classes. @poikilotherm has been driving a lot of this and may be able to comment further.
Moving this to "on hold" status until it's ready. Please let us know. (And yes, maybe @poikilotherm can comment to help move this along.)
A new PR has been created for the refactoring of this feature: https://github.com/IQSS/dataverse/pull/10803