Enable parsing of multi-line config values
On Slack, @charles-plessy reported that the new release of nf-core/pairgenomealign was hindered by a failing Download test.
I have thus extended the corresponding test in tools to include the notation as used by pairgenomealign.
pytest ./tests/pipelines/test_download.py -k test__find_container_images_config_nextflow
Indeed, the pattern failed with the current code. Upon closer inspection, I traced the issue to nf_core.utils.fetch_wf_config(), which returns truncated configuration values for multi-line strings because the current parsing uses nfconfig_raw.splitlines().
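To illustrate the truncation, here is a minimal sketch of line-based parsing. It is not the actual code of fetch_wf_config(); the function name parse_by_line and the sample config text are made up for this example.

```python
# Hypothetical raw config text with one value that spans two lines.
nfconfig_raw = (
    "manifest.name = 'demo'\n"
    "process.container = 'first line\n"
    "second line'\n"
    "manifest.version = ''"
)


def parse_by_line(raw: str) -> dict[str, str]:
    """Naive parsing: every physical line is treated as one complete entry."""
    config = {}
    for line in raw.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # continuation lines contain no " = " and are silently dropped
            config[key] = value.strip("'\"")
    return config


config = parse_by_line(nfconfig_raw)
print(config["process.container"])  # prints "first line"; "second line" is lost
```

Any approach that splits on newlines first has this problem, because the information about which lines belong together is discarded before the key/value split happens.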
Since the test mocks and partially duplicates the config parsing function, I developed the new regex there first and then updated nf_core.utils.fetch_wf_config() with the same logic, which is capable of multi-line parsing.
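As a rough sketch of what regex-based multi-line parsing can look like: the pattern below is an illustration written for this comment, not the regex from this PR, and ENTRY_PATTERN and parse_flat_config are hypothetical names. It assumes that every entry starts at the beginning of a line with `key = ` and that continuation lines do not themselves contain ` = `.

```python
import re

# A value runs lazily until the next line that looks like "key = ..." or until
# the end of the text. DOTALL lets the value span newlines.
ENTRY_PATTERN = re.compile(
    r"^(?P<key>[^\n=]+?) = (?P<value>.*?)(?=\n[^\n=]+? = |\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_flat_config(raw: str) -> dict[str, str]:
    """Parse 'key = value' entries whose values may span several lines."""
    config = {}
    for entry in ENTRY_PATTERN.finditer(raw):
        # Store the key even when the value is an empty string.
        config[entry.group("key")] = entry.group("value").strip("'\"")
    return config


raw = (
    "manifest.name = 'demo'\n"
    "process.container = 'first line\n"
    "second line'\n"
    "manifest.version = ''"
)
parsed = parse_flat_config(raw)
```

With this input, the two-line value is kept intact and the empty manifest.version is still present as a key. The stated assumption is the main limitation: a continuation line that contains ` = ` would be mistaken for a new entry.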
In addition, I had to tweak the parsing of the main.nf as well, since mypy would otherwise not allow me to commit my changes:
nf_core/utils.py:348: error: Incompatible types in assignment (expression has type "Match[str] | None", variable has type "Match[str]") [assignment]
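For context, one common way this mypy error arises is reusing a variable name that was first bound to a Match[str] (for example as the loop variable of re.finditer) for the result of re.match, which may be None. The sketch below shows the pattern and one way around it; the function collect and both regexes are assumptions for illustration, not the code at nf_core/utils.py:348.

```python
import re


def collect(config_text: str, main_nf_text: str) -> dict[str, str]:
    config: dict[str, str] = {}
    # re.finditer yields Match[str], so mypy infers `entry` as Match[str].
    for entry in re.finditer(r"^(\S+) = (.*)$", config_text, re.MULTILINE):
        config[entry.group(1)] = entry.group(2)
    # re.match returns Match[str] | None. Assigning it to `entry` again would be
    # reported as an incompatible assignment, so a separate name is used here.
    for line in main_nf_text.splitlines():
        param = re.match(r"^\s*(params\.[a-zA-Z0-9_]+)\s*=(?!=)", line)
        if param is not None:
            config[param.group(1)] = "null"
    return config


result = collect("a = 1", "params.foo = 3\nif (x == 1) {}")
```

Using a distinct name keeps both variables precisely typed; explicitly annotating a shared variable as optional would also satisfy mypy, at the cost of extra None checks.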
PR checklist
- [x] This comment contains a description of changes (with reason)
- [x] CHANGELOG.md is updated
- [x] If you've fixed a bug or added code that should be tested, add tests!
I don't think that any of my changes could have caused this test failure? Admittedly, I do not even know what kind of test that is...
FYI, the refgenie server seems to be down, so the CI test pipelines/test_refgenie.py is currently expected to fail.
Admittedly, my changes apparently introduced an issue with the config parsing. When running
pytest tests/pipelines/test_create_app.py -k test_github
the manifest.version key is missing from the configuration:
sorted(wf_config.keys())
'manifest.contributors', 'manifest.defaultBranch', 'manifest.description', 'manifest.homePage', 'manifest.mainScript', 'manifest.name', 'manifest.nextflowVersion', 'nextflow.enable.configProcessNamesValidation'
I will need to investigate...
Ok, this looks much better now. The only test that is still failing is Refgenie as expected.
But I could fix the other test failures, which indeed indicated a serious issue: I had not considered that we might encounter empty strings as config values. For an empty value, the last group no longer had a match that resulted in an overall match satisfying the regex, so the regex engine backtracked. In the (.*?)(?=(\n[^\n=]+?\s*=) part of the old regex, the first group can give up one match, since it has one iteration it can backtrack into. The second group then promptly matches, again has one iteration, fails on the next one, and the rest of the pattern fails. Backtracking again, the second group now has one backtracking position and reduces its match, the first group tries a second iteration, and so on. Ultimately, this led to catastrophic (effectively endless) backtracking for empty strings.
Apart from that, the condition if k and v: is not satisfied when v is empty, because bool('') evaluates to False. As a result, config[k] was never created for empty values, which ultimately caused the manifest.version key error.
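The truthiness pitfall can be reproduced in isolation. The following is a minimal sketch with made-up helper names (keep_truthy, keep_all_keys) and sample pairs, not the code from nf_core/utils.py:

```python
pairs = [("manifest.name", "demo"), ("manifest.version", "")]


def keep_truthy(items):
    """Guard on truthiness: entries with an empty string value are dropped."""
    config = {}
    for k, v in items:
        if k and v:  # bool("") is False, so empty values never get stored
            config[k] = v
    return config


def keep_all_keys(items):
    """Guard only against a missing value, so empty strings are kept."""
    config = {}
    for k, v in items:
        if k and v is not None:
            config[k] = v
    return config


dropped = keep_truthy(pairs)
kept = keep_all_keys(pairs)
print(sorted(dropped))  # ['manifest.name']
print(sorted(kept))     # ['manifest.name', 'manifest.version']
```

Checking `v is not None` instead of relying on truthiness distinguishes "no value captured" from "value is the empty string", which is the distinction that matters for keys like manifest.version.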
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 76.97%. Comparing base (a490554) to head (a9450f2). Report is 8 commits behind head on dev.