how to customize steps before each workflow test
the pytest.mark.workflow marker has been really useful for customizing steps after the workflow is run, but i'm wondering if pytest-workflow has any way to run steps before the workflow is run as well, similar to setup with fixtures in plain pytest. In our use case, it would be to perform a git read-tree operation to give our pytest access to packages from a remote repo during the test; since we do not want to maintain a link to those modules in our repo directly, it would be great to stage them in each pytest_workflow run.
of course, in our use case it would also be helpful if we can use the workflow_dir fixture to do this, but i don't know if that's possible or necessary.
it would be to perform a git read-tree operation to give our pytest access to packages from a remote repo during the test, but since we do not want to maintain a link to those modules in our repo directly
What kind of packages? Can you specify the use case a bit further? Maybe provide some links to what you are doing?
maybe modules is a better way to describe it than packages. we are creating a modules-type repo similar to nf-core/modules, and users will be able to install components from this repo and our repo into their pipeline-type repo. we want our modules repo to be stand-alone, but we also want some of our subworkflows to be able to use some of the publicly available components at nf-core/modules. We think the easiest way to run the test is to do something like:
# setup
git remote add -f -t master --no-tags nf-core-repo https://github.com/nf-core/modules.git
git read-tree --prefix=modules/nf-core/ -u nf-core-repo/master:modules/nf-core/
# do pytest-workflow as normal
# tear-down
git rm -rf modules/nf-core/
git remote remove nf-core-repo
we could in theory maintain the nf-core/modules components in the repo long-term but we're wondering if there's a way to avoid that.
i think it would be nice to understand how to run any setup steps with pytest-workflow, if that's possible.
it might be helpful to add that, in our standalone repo, we want to use something very similar to the following file: https://github.com/nf-core/modules/blob/master/tests/test_versions_yml.py
we just want to build on this to make sure we get the cross-repo functionality that we're aiming for.
i think i figured something out with pytest_sessionstart and pytest_sessionfinish in a conftest.py file. sorry if i wasted your time; i am really unfamiliar with pytest and was racking my brain all day yesterday! it might be slightly unfortunate that the remote repo files are not cleaned up in the pytest workflow directory (despite being cleaned up in my codebase), but i think that's ok because it's meant as a temp directory anyway.
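for anyone landing on this later, the idea is roughly the following. this is a minimal sketch, not my exact file; it just wraps the git commands from above in the two pytest session hooks and assumes pytest is started from the repository root:

# conftest.py -- rough sketch; mirrors the git commands shown earlier
import subprocess


def pytest_sessionstart(session):
    # stage the nf-core/modules components before any workflow is run
    subprocess.run(
        ["git", "remote", "add", "-f", "-t", "master", "--no-tags",
         "nf-core-repo", "https://github.com/nf-core/modules.git"],
        check=True,
    )
    subprocess.run(
        ["git", "read-tree", "--prefix=modules/nf-core/", "-u",
         "nf-core-repo/master:modules/nf-core/"],
        check=True,
    )


def pytest_sessionfinish(session, exitstatus):
    # remove the staged components and the temporary remote again
    subprocess.run(["git", "rm", "-r", "-f", "modules/nf-core/"], check=True)
    subprocess.run(["git", "remote", "remove", "nf-core-repo"], check=True)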
I understand, the pytest API is not easy to grasp! pytest_sessionstart and pytest_sessionfinish would indeed work. I think the cleanup might be done anyway because it will call these functions for each module.
Were you able to solve your problem?
i would like to clear the database between workflow runs. is there any way to do this?
anyone coming here, this worked for me:
from pytest_workflow.workflow import Workflow
I understand, the pytest API is not easy to grasp!
pytest_sessionstart and pytest_sessionfinish would indeed work. I think the cleanup might be done anyway because it will call these functions for each module. Were you able to solve your problem?
I think this is an ok solution, but i am a bit nervous about frequently running these operations in the same directory where my code is being developed. i would imagine that if i killed a pytest run during execution, pytest_sessionfinish would not execute, and in my use case i'd be left with files in my git index that i don't want there. i would rather have a pytest fixture that can run in the workflow_dir where the rest of the pytest will actually run... but having thought about this a while longer, i can imagine why that might not exist yet or might be challenging to implement when there are other solutions available.
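one way to soften that risk (just a sketch of the idea, not something pytest-workflow provides, and the helper name is made up) would be to make the staging idempotent: run the same teardown defensively at the start of pytest_sessionstart, so leftovers from a killed run are removed before new files are staged.

# sketch of the idea above; the staging commands are the same as in the earlier sketch
import subprocess


def _unstage_nf_core_modules():
    # --ignore-unmatch lets this succeed even when nothing is staged,
    # and a missing remote is simply ignored
    subprocess.run(["git", "rm", "-r", "-f", "--ignore-unmatch", "modules/nf-core/"])
    subprocess.run(["git", "remote", "remove", "nf-core-repo"],
                   stderr=subprocess.DEVNULL)


def pytest_sessionstart(session):
    _unstage_nf_core_modules()  # clear leftovers from a previously killed run
    ...  # stage the remote modules as in the earlier sketch


def pytest_sessionfinish(session, exitstatus):
    _unstage_nf_core_modules()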
I found this thread looking for the same functionality. What I want to do is not store workflow inputs in plain text but generate them programmatically so it is easier to maintain. For example, something like the following pseudo-code:
import pytest
from pathlib import Path


@pytest.fixture
@pytest.mark.pre_workflow('Test workflow on a simple text input')
def input_for_test_workflow_on_a_simple_text_input(workflow_dir: Path) -> None:
    with open(workflow_dir / "input.txt", "w") as handle:
        for number in (1, 2, 3):
            handle.write(f"{number}\n")
- name: Test workflow on a simple text input
  command: awk '{print $1*3}' input.txt > output.txt
  files:
    - path: output.txt
@pytest.mark.workflow('Test workflow on a simple text input')
def test_workflow_on_a_simple_text_input(workflow_dir: Path):
    with open(workflow_dir / "output.txt") as handle:
        assert [int(line) for line in handle] == [3, 6, 9]
Do you think this is possible? I spent some time reviewing the plugin code and I think I'd need to find a way to get the fixture/mark attached to the WorkflowTestsCollector in such a way that the pure function/fixture could be run after the workflow directory is created but before the workflow is enqueued. Is this conceptually how it would need to work? Or am I pointed in the wrong direction?
Sorry for my late reply.
I found this thread looking for the same functionality. What I want to do is not store workflow inputs in plain text but generate them programmatically so it is easier to maintain.
We went down that route before we made pytest-workflow. The problem was that when we changed the inputs we also had to change the code, and the code might have bugs that caused the whole thing to crash. Having it in plain text is much clearer. If you want something more programmatic, you could take a look at YAML anchors. That way you can avoid a lot of the repetition.
If you still feel such advanced functionality is needed, I think there are plenty of other test frameworks out there that do follow this philosophy. If I were to move pytest-workflow in this direction, it would violate one of the core design goals: keeping tests as simple, quick-to-write YAML files.
Thanks @rhpvorderman, I appreciate the context and I also respect you making difficult decisions to keep the project true to its values. It's still a wonderful and useful tool for many use cases!