Enabling discovery of WorkflowHub workflows from arXiv papers
The idea is to create an arXiv Labs project that adds links to WorkflowHub workflows in the arXiv paper interface.
https://info.arxiv.org/labs/showcase.html
Part of the submission process is a pull request like the ones listed at https://github.com/arXiv/arxiv-browse/pulls?q=is%3Apr+label%3ALabs
(Note: arXiv has just paused adding new projects to arXiv Labs, but plans to resume submissions later; if we agree this is a good idea, we can get ready now!)
- [x] Are there existing papers on arXiv linked to or otherwise referenced from workflows already on WorkflowHub? (Yes, see below)
- [ ] What is the best way to represent those connections in the metadata? Presumably the "publications" section, though that is not listed at https://about.workflowhub.eu/docs/metadata-list/ and it doesn't appear possible to add a new publication while editing workflow metadata.
- [ ] Can these paper references be automatically extracted from a submitted workflow or RO-Crate?
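On the last question, a minimal sketch of what automatic extraction could look like, assuming the references appear as plain arXiv URLs or `arXiv:` prefixes in free text such as a workflow description or README. The function names `extract_arxiv_ids` and `arxiv_doi` are hypothetical and not part of WorkflowHub or any RO-Crate tooling; the only external fact relied on is that arXiv registers DOIs of the form `10.48550/arXiv.<id>`.

```python
import re

# Matches new-style arXiv identifiers (YYMM.NNNN or YYMM.NNNNN) in abs/pdf
# URLs or after an "arXiv:" prefix. A trailing version suffix such as "v2"
# is dropped. Old-style IDs (e.g. hep-th/9901001) are not handled here.
ARXIV_ID = re.compile(
    r"(?:arxiv\.org/(?:abs|pdf)/|arXiv:)(\d{4}\.\d{4,5})(?:v\d+)?",
    re.IGNORECASE,
)


def extract_arxiv_ids(text: str) -> list[str]:
    """Return the unique arXiv IDs found in text, in order of first appearance."""
    seen: list[str] = []
    for match in ARXIV_ID.finditer(text):
        arxiv_id = match.group(1)
        if arxiv_id not in seen:
            seen.append(arxiv_id)
    return seen


def arxiv_doi(arxiv_id: str) -> str:
    """Build the DOI that arXiv registers for a given identifier."""
    return f"10.48550/arXiv.{arxiv_id}"


if __name__ == "__main__":
    description = "Described in https://arxiv.org/abs/2209.02498 (preprint)."
    for found in extract_arxiv_ids(description):
        print(found, arxiv_doi(found))
```

Something like this could run over the description and any README in an uploaded snapshot, then propose the matches to the submitter for confirmation rather than linking them silently.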
I found 3 existing WorkflowHub workflows that reference arXiv papers:
- https://workflowhub.eu/workflows/626 links to https://arxiv.org/abs/2209.02498 in the description but does not formally connect to that paper in the metadata. The arXiv paper mentions neither WorkflowHub nor the DOI that was assigned to the workflow. Apparently this arXiv paper was further improved and published by GigaScience as https://doi.org/10.1093/gigascience/giad120, which does mention and cite the WorkflowHub workflow by name and DOI.
- https://workflowhub.eu/workflows/576 links to https://arxiv.org/abs/2212.01666 in the readme.md file in the uploaded snapshot, but that paper is not mentioned in the workflow description, nor is it formally linked in the metadata. The arXiv paper does not link to or mention the workflow on WorkflowHub by name, identifier, or DOI. The paper was later published by GigaScience at https://doi.org/10.1093/gigascience/giad094, which did cite the WorkflowHub workflow by name and DOI.
- https://workflowhub.eu/workflows/1857 links to https://arxiv.org/abs/2502.06124 in the workflow description, but that paper is not formally linked in the workflow metadata. The arXiv paper does not link to, mention, or cite the WorkflowHub workflow. It appears that the arXiv paper was later published by GigaScience at https://doi.org/10.1093/gigascience/giaf107, which did cite the WorkflowHub workflow by name and link but not by DOI (no DOI was issued by WorkflowHub).
Just so I understand correctly: it would be a kind of "widget" embedded on the page when viewing an arXiv paper, that would query some existing or new WorkflowHub API endpoint to get a list of all workflows related in some way to a paper (by its DOI), then display a list of links back to the relevant WorkflowHub entries?
Correct!
And maybe invite the reader to upload a related workflow to WorkflowHub if there is no match.
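The widget behaviour discussed above could be sketched roughly as follows. This is only an illustration of the lookup and the no-match fallback: the `Workflow` record, `workflows_for_paper`, and `render_widget` are hypothetical names, the data is an in-memory list rather than a call to any real WorkflowHub endpoint, and the titles are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    # Hypothetical record shape, not WorkflowHub's actual API schema.
    url: str
    title: str
    paper_ids: set[str] = field(default_factory=set)  # linked arXiv IDs


def workflows_for_paper(arxiv_id: str, workflows: list[Workflow]) -> list[Workflow]:
    """Return the workflows whose metadata links to the given arXiv ID."""
    return [w for w in workflows if arxiv_id in w.paper_ids]


def render_widget(arxiv_id: str, workflows: list[Workflow]) -> str:
    """Render a plain-text list of links, or an upload invitation if none match."""
    matches = workflows_for_paper(arxiv_id, workflows)
    if not matches:
        return "No related workflow found. Upload one at https://workflowhub.eu/"
    return "\n".join(f"- {w.title}: {w.url}" for w in matches)


if __name__ == "__main__":
    # Placeholder titles; the URL/paper pairing comes from the list above.
    known = [
        Workflow("https://workflowhub.eu/workflows/626", "Workflow 626", {"2209.02498"}),
        Workflow("https://workflowhub.eu/workflows/576", "Workflow 576", {"2212.01666"}),
    ]
    print(render_widget("2209.02498", known))
    print(render_widget("2502.06124", known))
```

In a real arXiv Labs integration the lookup would presumably be a request to WorkflowHub keyed on the paper's identifier or DOI, which in turn depends on the papers being formally linked in the workflow metadata first.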