Automatically create Location entry in Thoth when a Dissemination workflow succeeds
On successful dissemination, add a new Location entry to the relevant Publication in Thoth, recording the URL(s) of the newly-created directory entry and/or copy of the content.
To be added to the existing Internet Archive/Figshare workflows (Crossref is not relevant as DOIs are already present in Thoth at time of submission). For Internet Archive, the Platform type `INTERNET_ARCHIVE` is now available; for institutional repositories such as Figshare, `OTHER` will need to be used.
This would then make it possible to replace the current logic for checking whether a Work already exists in the target platform with a check for whether a Location with the relevant Platform exists.
This can then be extended to new dissemination platforms when they are implemented. Ease of implementation may depend on individual platforms' workflows; Internet Archive returns the relevant URL to the dissemination script immediately on creation, but e.g. FTP-based workflows are unlikely to be as neat.
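A minimal sketch of the Location-based existence check suggested above, assuming each Publication is available as a dict with a `locations` list whose entries carry a `locationPlatform` field. The dict shape, the sample data and the function name `already_disseminated` are illustrative assumptions, not the actual Thoth client API.

```python
def already_disseminated(publications, platform):
    """Return True if any publication already has a Location on `platform`.

    `publications` is assumed to be a list of dicts shaped like
    {"publicationId": ..., "locations": [{"locationPlatform": ...}, ...]}.
    """
    return any(
        location.get("locationPlatform") == platform
        for publication in publications
        for location in publication.get("locations", [])
    )


# Hypothetical data: one publication already archived on Internet Archive.
work_publications = [
    {
        "publicationId": "pub-1",
        "locations": [{"locationPlatform": "INTERNET_ARCHIVE"}],
    },
    {"publicationId": "pub-2", "locations": []},
]
```

A check like this cannot tell apart platforms that are all recorded as `OTHER`, so it would probably need a further test (for example on the URL) for institutional repositories.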
As part of this work, add something like `PUBLISHER_WEBSITE` to the set of location platform types.
Tracked separately now under #561
- [x] For each platform, determine the appropriate `landingPage` and `fullTextUrl` to record in Thoth
  - [X] Internet Archive: `landingPage` = `archive.org/details/[workId]`, `fullTextUrl` = `archive.org/details/[workId]/[filename].pdf`
  - [x] Figshare: note `figshare.com` (API) vs `repository.lboro.ac.uk` (UI) versions of same links, Handles, etc; treat as a "Figshare" upload or a "Loughborough repository" upload (repositories may migrate platforms)?
  - [x] CUL: tbd
  - [x] Zenodo (or do under #542)
  - [ ] OAPEN: works don't acquire these URLs until some hours/days after dissemination. Currently handled manually. Any alternative? Split out as separate task?
  - [X] (Crossref: not relevant here)
- [x] Extend `disseminator` to (retrieve and) pass back `landingPage` and `fullTextUrl` when they have been assigned via successful archiving
  - [x] this will need to be on a per-publication basis as we sometimes disseminate more than one format, so `publicationId` will also need to be passed back, or at least `publicationType`
- [x] Add script which takes `publicationId`, `locationPlatform`, `landingPage` and `fullTextUrl` and writes location to Thoth
  - [x] `locationPlatform` could be supplied directly or derived from inputs to `disseminator`
  - [x] `publicationId` could be passed back directly as above, or obtained from Thoth via e.g. `workId` + `publicationType`
- [x] Extend GitHub Actions to take output from each `disseminator` run and pass it to new script
  - [x] for dissemination of multiple formats, should the script be called multiple times, or should it handle multiple locations itself?
- [x] Determine whether any new `locationPlatform`s need to be added to Thoth
  - [x] e.g. `FIGSHARE` - or as above, should it be e.g. `LBORO_REPO`?
  - [ ] Any way of marking/"locking" these `location`s as created by Thoth Dissemination Service/part of Thoth Archiving Network?
  - [ ] Is it still appropriate to permit only one `location` per `locationPlatform` for all of these? (e.g. users might independently upload copies to additional Figshare repositories, etc - not necessarily sensible but shouldn't go unrecorded)
- [x] Catchup run: ensure that works disseminated prior to implementation of this feature all have appropriate locations created
- [ ] Could a similar mechanism be used (on a regular, automatic basis) to handle OAPEN, as above?
- [x] Add an appropriate set of Thoth credentials as repository secret (or organisation secret - would require permissions)
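
The steps in the task list can be sketched roughly as follows: build the Internet Archive URLs from the pattern noted above, then turn each per-publication result into a location record ready to be written to Thoth. This is an illustration only. The function names, the `https://` scheme and the shape of the result dicts are assumptions, and the actual write to Thoth is deliberately left out.

```python
def internet_archive_urls(work_id, filename):
    """Build landingPage/fullTextUrl following the pattern in the task list.

    The https:// scheme is an assumption; the task list only gives
    archive.org/details/[workId] and
    archive.org/details/[workId]/[filename].pdf.
    """
    landing_page = f"https://archive.org/details/{work_id}"
    full_text_url = f"{landing_page}/{filename}.pdf"
    return landing_page, full_text_url


def build_locations(results, location_platform):
    """Turn per-publication disseminator results into location records.

    `results` is a hypothetical list of dicts, one per disseminated format,
    each carrying publicationId, landingPage and fullTextUrl.
    """
    required = ("publicationId", "landingPage", "fullTextUrl")
    locations = []
    for result in results:
        # Refuse to record a location with any of the expected fields missing.
        missing = [key for key in required if not result.get(key)]
        if missing:
            raise ValueError(f"disseminator result is missing: {missing}")
        locations.append(
            {
                "publicationId": result["publicationId"],
                "locationPlatform": location_platform,
                "landingPage": result["landingPage"],
                "fullTextUrl": result["fullTextUrl"],
            }
        )
    return locations
```

Accepting a list is one possible answer to the question of whether the script should be called once per format or handle multiple locations itself; calling it once per publication with a single-element list would work equally well.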