dataverse
Exclude deaccessioned datasets from being harvested
We'd like deaccessioned datasets to be excluded from being harvested. I have tried to create an OAI set by setting the Solr field publicationStatus:Deaccessioned in the OAI set query; cf. this thread in the Dataverse Google Group. Please look into this and make it possible to exclude deaccessioned datasets from harvesting. Thanks!
Just noting that this problem might have been fixed since this GitHub issue was written. UVA Dataverse, running v5.9, harvests records from Harvard Dataverse Repository, and @shlake reported today that a recent harvesting run removed datasets that had been deaccessioned since the last time that UVA Dataverse harvested from Harvard Dataverse Repository.
Nice.
@philippconzett are you able to re-test?
Sorry for my late reply, @pdurbin. We are currently migrating DataverseNO to the cloud, and during that process we won't be upgrading from the current version we use in production. Once we've migrated, we'll be able to test. Maybe someone else in the community could test this in the meantime? Thanks!
sizing:
- This one may already be fixed.
- This is going to be tested.
- Sized at a 3. Seems very small to run the test.
- Will be closed if the tests are good, without further processing.
- Test with 5.12.1
If tests pass:
- Note that this was tested in version 5.12.1
If the tests fail:
- This issue will get re-assessed and stay open.
- It will be resized at that point and re-queued.
When this is tested, make sure the harvesting is going from one Dataverse installation to another, both running the same latest software version. Julian has called this out as important because part of the issue may have been that the two Dataverse software versions were different. The current version should be 5.12.1.
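For whoever runs the test, one way to see what the source installation exposes is to look at the record headers in its OAI-PMH response. The OAI-PMH protocol marks removed records with `status="deleted"` on the `<header>` element. The following is only a minimal sketch under that assumption: the function name `split_records_by_status` and the sample identifiers are made up for illustration, and it has not been checked against how Dataverse itself reports deaccessioned datasets.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
HEADER_TAG = "{http://www.openarchives.org/OAI/2.0/}header"


def split_records_by_status(oai_response_xml):
    """Return (active_ids, deleted_ids) found in an OAI-PMH
    ListIdentifiers or ListRecords response (hypothetical helper)."""
    root = ET.fromstring(oai_response_xml)
    active, deleted = [], []
    for header in root.iter(HEADER_TAG):
        identifier = header.findtext(
            "oai:identifier", default="", namespaces=OAI_NS
        )
        # Per OAI-PMH, a removed record carries status="deleted".
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            active.append(identifier)
    return active, deleted


# Hand-written sample response; the identifiers are placeholders.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>doi:10.5072/FK2/EXAMPLE1</identifier>
    </header>
    <header status="deleted">
      <identifier>doi:10.5072/FK2/EXAMPLE2</identifier>
    </header>
  </ListIdentifiers>
</OAI-PMH>
"""

active, deleted = split_records_by_status(SAMPLE)
print("active:", active)
print("deleted:", deleted)
```

Running this against a saved response from the source installation before and after deaccessioning a test dataset would show whether the dataset disappears from the set or is flagged as deleted.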
The plan:
- This is not going to go onto the board.
- Stephen is going to take a look at this today and determine if it's actually still broken.
- If broken, it stays open and works its way into the backlog.
- If fixed, Stephen is going to close it.
As of 9/2018, the OAIRecordServiceBean filters out datasets that are fully deaccessioned. Apparently this has been fixed for a long time. Closing.