Add workflows and trainings?

Open paulzierep opened this issue 1 month ago • 5 comments

Currently, we only list tools here, but logically workflows and training materials could also be considered part of research software — depending on how you define it. We are already collecting such information in https://github.com/research-software-ecosystem/micoreca and the Galaxy CoDex, and are thinking about how to consolidate that effort with the work being done here.

It is probably not a good idea to maintain multiple repositories that perform similar data-scraping tasks. Instead, these resources should ideally be centralized. For our use case, it would definitely be beneficial to include trainings and workflows in the atlas. So the open question is: where should this content be stored?

Let’s have an open discussion about it!

paulzierep avatar Nov 21 '25 10:11 paulzierep

Fully agree: workflows, and maybe even training materials, should be there. The CoDex tools content is already in the repo at https://github.com/research-software-ecosystem/content/tree/master/imports/galaxy. IMO, imports from e.g. IWC and/or WorkflowHub could live in https://github.com/research-software-ecosystem/content/tree/master/imports/iwc and https://github.com/research-software-ecosystem/content/tree/master/imports/workflowhub.

hmenager avatar Nov 25 '25 12:11 hmenager

The registries and software indexes contributing to RSec sometimes contribute not single tools but workflows wrapped in a black box, toolboxes (sets of tools), workflows, whole indexes, etc. From my point of view, each contributor should make an annotation effort on their side to say "hey, this is a toolbox" or "this is a workflow" (without going into the kind of workflow), because they (should) know much better than RSec what they are providing.
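To make the annotation idea concrete, here is a minimal sketch of what a contributor-declared "kind" could look like. Everything in it is an assumption for illustration: the `EntryKind` values, the `Entry` fields, and the sample identifiers are hypothetical and not part of any existing RSec schema.

```python
from dataclasses import dataclass
from enum import Enum


class EntryKind(Enum):
    """Hypothetical coarse categories a contributor could declare."""
    TOOL = "tool"
    TOOLBOX = "toolbox"
    WORKFLOW = "workflow"
    INDEX = "index"


@dataclass(frozen=True)
class Entry:
    """A contributed record; field names are illustrative, not an RSec schema."""
    identifier: str
    source: str
    kind: EntryKind


def group_by_kind(entries):
    """Group entry identifiers by the kind their contributor declared."""
    grouped = {}
    for entry in entries:
        grouped.setdefault(entry.kind.value, []).append(entry.identifier)
    return grouped


# Made-up sample entries, only to show the shape of the data.
entries = [
    Entry("blast", "bio.tools", EntryKind.TOOL),
    Entry("example-toolbox", "example-registry", EntryKind.TOOLBOX),
    Entry("example-workflow", "example-registry", EntryKind.WORKFLOW),
]
grouped = group_by_kind(entries)
```

The point of the sketch is only that the kind is set by the provider at submission time, so RSec would not have to guess it afterwards.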

jmfernandez avatar Nov 26 '25 09:11 jmfernandez

This is two issues in one, but for the training materials, a link already exists between TeSS and bio.tools: training materials are tagged with bio.tools entries, and bio.tools has a query link to find all training materials tagged with a given software. For example, the link https://tess.elixir-europe.org/materials?q=BLAST on https://bio.tools/blast lists 45 training materials linked to BLAST. It may be interesting to capture, or count, the number of training materials as a maturity metric and expose this on OpenEBench. For bio.tools we could opt not to display the link if there are no training materials in TeSS. The hard part (or at least much more work) is tagging all materials in TeSS with the relevant bio.tools entries. Maybe a topic for a BioHackathon?
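As a rough sketch of the query link and the "hide it when empty" idea: the base URL and the `q` parameter are taken from the BLAST example above, while the function names and the `material_count` argument are hypothetical. How the count would actually be obtained from TeSS is not covered here.

```python
from urllib.parse import urlencode

# Base URL as it appears in the BLAST example link.
TESS_MATERIALS_URL = "https://tess.elixir-europe.org/materials"


def tess_materials_link(tool_name: str) -> str:
    """Build the TeSS search link for a tool name (hypothetical helper)."""
    return f"{TESS_MATERIALS_URL}?{urlencode({'q': tool_name})}"


def link_to_display(tool_name: str, material_count: int):
    """Return the link only if at least one training material was found.

    material_count is assumed to be supplied by the caller.
    """
    if material_count > 0:
        return tess_materials_link(tool_name)
    return None
```

For instance, `tess_materials_link("BLAST")` gives back the same link as in the example above, and `link_to_display("BLAST", 0)` returns `None`, so nothing would be shown.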

magnuspalmblad avatar Nov 26 '25 09:11 magnuspalmblad

Training materials, as well as test datasets, related to a tool of any kind are badly needed. They can indeed be an indicator of software maturity. But they are represented in different ways by the different providers.

In the case of OpenEBench technical monitoring, as @redmitry can explain, several tool registries are ingested (explicitly fetching from bio.tools, bioconda, etc.). In the long term, it would be desirable to have OpenEBench technical monitoring ingest entries from RSec. But for that, the primary tool entry providers should first have matured their integration with RSec (at the level of submitting new or updated entries, I mean).

jmfernandez avatar Nov 26 '25 09:11 jmfernandez

I agree that training and workflows could be better represented or represented at all.

As far as I can see, the ecosystem so far is tool-centric. Adding training and workflows might require more layers of IDs, which could become difficult to maintain and cross-link.

If we stay with the tool-centric approach, one could relate back to TeSS via what we have in bio.tools, or ask TeSS to provide tool-based JSON files. The same could be done for WorkflowHub, in case that is of interest.
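A minimal sketch of what "tool-based" could mean in practice, assuming each training material carries a list of tool IDs. The record layout (`title`, `tools`) and the sample data are made up for illustration and do not reflect the actual TeSS export format.

```python
import json
from collections import defaultdict


def materials_by_tool(materials):
    """Invert material -> tools records into a tool -> material titles mapping."""
    by_tool = defaultdict(list)
    for material in materials:
        for tool_id in material["tools"]:
            by_tool[tool_id].append(material["title"])
    return {tool_id: sorted(titles) for tool_id, titles in by_tool.items()}


# Made-up sample records; the field names are assumptions.
materials = [
    {"title": "Intro to sequence search", "tools": ["blast"]},
    {"title": "Proteomics basics", "tools": ["example-tool", "blast"]},
]
per_tool = materials_by_tool(materials)

# One JSON object keyed by tool ID, which could be split into per-tool files.
print(json.dumps(per_tool, indent=2, sort_keys=True))
```

Keeping the tool ID as the only key avoids adding a second layer of identifiers for the materials themselves.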

veitveit avatar Nov 27 '25 09:11 veitveit