versions.yml-generating code produces malformed files without the nf-module failing
Description of the bug
As described here, the versions.yml-generating code in the modules may produce malformed files without the module failing. The problem has only been observed with Singularity.
Due to "instability" of the underlying system (HPC), the call to a "tool" in a module script may finish correctly while the subsequent call to <tool> --version in the versions.yml-generating code fails. The result is an invalid versions.yml file (it contains an error message), yet the execution of the module has not failed. (This is because a Bash command like echo $(foo 2>&1) > versions.yml returns exit code 0 even when the call to foo returns a non-zero exit code.)
In such cases, re-running the pipeline with -resume does not help: since the module execution didn't fail, it is not re-run and no new versions.yml file is produced. (A user who understands the problem can fix the versions.yml file manually, but that is not ideal.)
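The exit-code behaviour described above can be reproduced outside of Nextflow with a minimal sketch. Here foo is a hypothetical stand-in for a failing <tool> --version call, and the error text and the temporary directory are assumptions made only for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a `<tool> --version` call that fails.
foo() {
    echo "FATAL: container could not be started" >&2
    return 1
}

workdir=$(mktemp -d)

# Pattern from the issue: foo fails inside the command substitution,
# but the exit status of the line is that of `echo`, which succeeds.
echo $(foo 2>&1) > "$workdir/versions.yml"
echo_status=$?

# Assigning the output to a variable first preserves the exit status of foo.
version=$(foo 2>&1)
assign_status=$?

echo "echo pattern exit status: $echo_status"
echo "assignment exit status:   $assign_status"
echo "versions.yml content:     $(cat "$workdir/versions.yml")"
```

The first pattern reports success and leaves the error message in versions.yml, while the plain assignment exposes the non-zero exit status, which is what the `|| exit 1` proposal in this issue relies on.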
Locally, we've overcome the problem by either (1) hardcoding the version numbers in the versions.yml-generating code or (2) moving the versions.yml-generating code to the top of the script section in the modules. (But those "solutions" are of course not ideal.)
The issue has been discussed here: https://nfcore.slack.com/archives/CE5LG7WMB/p1727099518643229
It was proposed that the versions.yml-generating code could perhaps be changed to something like
version=\$(iphop --version 2>&1) || exit 1
top=\$(echo "\$version" | head -1 ) || exit 1
cleaned=\$(echo "\$top" | sed 's/^.*iPHoP v//; s/: integrating.*\$//' ) || exit 1
cat <<-END_VERSIONS > versions.yml
"${task.process}":
iphop: \$cleaned
END_VERSIONS
but that doesn't work for all tools as illustrated here.
If it is decided to change the versions.yml-generating code, then I guess a module template somewhere has to be changed as well.
set -e might also be helpful to write this in a nicer way.
Apparently, there are also some cases where calls are made that are expected to give non-zero return codes; those would need to be handled somehow (e.g. by suffixing with || true).
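A rough sketch of how set -e and || true could be combined, written as plain Bash rather than as a Nextflow script block. The names flaky_tool, odd_tool and PROCESS_NAME are hypothetical placeholders, and the subshell is there only so that the abort stays confined to this example:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: flaky_tool fails on --version, while
# odd_tool prints its version but returns a non-zero exit code by design.
flaky_tool() { echo "FATAL: image not available" >&2; return 1; }
odd_tool()   { echo "odd_tool 1.2.3"; return 1; }

workdir=$(mktemp -d)

# Run in a subshell so that `set -e` aborts only this block.
(
    set -e
    cd "$workdir"
    # Expected non-zero exit code: tolerate it explicitly with `|| true`.
    odd_version=$(odd_tool --version 2>&1 || true)
    # Unexpected failure: the assignment fails and `set -e` aborts here,
    # before versions.yml is written.
    flaky_version=$(flaky_tool --version 2>&1)
    cat <<END_VERSIONS > versions.yml
"PROCESS_NAME":
    odd_tool: ${odd_version#odd_tool }
    flaky_tool: $flaky_version
END_VERSIONS
)
status=$?

echo "block exit status: $status"
```

Capturing each version in a variable before the heredoc matters here, because a failing command substitution inside the heredoc body itself would not trigger set -e. Note also that a pipeline inside the substitution (for example piping into head or sed) reports the status of its last command unless pipefail is set.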