file based splits
I'm interested in file-based splits in order to keep modules together.
For example:
a.py
b.py
sub_module/c.py
sub_module/d.py
another_sub_module/e.py
A split might be: a.py and b.py in one group, sub_module in another, and another_sub_module in a third.
The cherry on top would be being able to recover these groups by name so I can label in other CI/CD.
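To make the request concrete, here is a minimal sketch of the grouping described above, assuming groups are keyed by top-level directory and that files in the root share a single group. The function `group_files_by_top_level` and the group name `root` are hypothetical and not part of pytest-split.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_files_by_top_level(paths, root_name="root"):
    """Map a group name to the test files that belong to it.

    Files directly in the root directory share one group (named by the
    hypothetical `root_name`); every other file is grouped under its
    top-level directory.
    """
    groups = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        name = parts[0] if len(parts) > 1 else root_name
        groups[name].append(path)
    return dict(groups)


files = [
    "a.py",
    "b.py",
    "sub_module/c.py",
    "sub_module/d.py",
    "another_sub_module/e.py",
]
named_groups = group_files_by_top_level(files)
for name, members in named_groups.items():
    print(f"{name}: {', '.join(members)}")
```

Because the result is keyed by name, the same mapping could be used to label jobs in another CI/CD system.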
Let's first align on the naming:
- module is a .py file
- package is a directory which contains .py files, including __init__.py
I think the pragmatic solution would be to ensure that tests within a single module are run in the same group. If the algorithm also took packages into account, it would easily get complex, as packages can be nested to basically endless levels.
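The property being asked for can be stated as a small check. This is only a sketch: it assumes each group is a list of pytest node IDs of the form `path/to/file.py::test_name`, and the helper names `module_of` and `modules_split_across_groups` are made up for illustration.

```python
from collections import defaultdict


def module_of(node_id):
    """Return the file part of a pytest node ID such as 'pkg/test_x.py::test_a'."""
    return node_id.split("::", 1)[0]


def modules_split_across_groups(groups):
    """Return the modules whose tests ended up in more than one group.

    `groups` is a list of groups, each a list of pytest node IDs.
    """
    seen = defaultdict(set)
    for index, node_ids in enumerate(groups):
        for node_id in node_ids:
            seen[module_of(node_id)].add(index)
    return sorted(module for module, indexes in seen.items() if len(indexes) > 1)


bad_split = [
    ["a.py::test_one", "sub_module/c.py::test_x"],
    ["a.py::test_two", "sub_module/d.py::test_y"],
]
good_split = [
    ["a.py::test_one", "a.py::test_two"],
    ["sub_module/c.py::test_x", "sub_module/d.py::test_y"],
]
print(modules_split_across_groups(bad_split))   # a.py is spread over two groups
print(modules_split_across_groups(good_split))  # no module is split
```

A split satisfies the "no module splits" requirement exactly when this check returns an empty list.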
Considering implementation, I think there are two options:
1. A new flag, e.g. --no-module-splits. This would bring the feature to both of the splitting algorithms (duration_based_chunks and least_duration, see README and sources in algorithms.py).
2. A new splitting algorithm
I'd favour option 1, as there is also a need for "no splits for classes" (e.g. https://github.com/jerry-git/pytest-split/issues/82), so a flag would be the better choice considering future development.
I believe there'd be many use cases for this, so I'm happy to take a PR if someone wants to give it a shot.
"recover these groups by name so I can label in other CI/CD"
@wd60622 what do you mean by this?
I would like to be able to know which files / modules are in each group. The motivation is the body of this issue, where I display the groups in a GitHub issue: https://github.com/pymc-labs/pymc-marketing/issues/1158
Thanks for clarifying the problem a bit @jerry-git
Are you imagining that the --splitting-algorithm flag works the same way as before?
For instance,
# Same behavior as before
pytest --splitting-algorithm=duration_based_chunks
# New behavior
pytest --splitting-algorithm=duration_based_chunks --no-module-splits
Here, --no-module-splits would act as a preprocessing step on the durations before the algorithm is run.
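One way to picture that preprocessing step is the sketch below. It is an assumption about how the proposed flag could behave, not pytest-split's code: per-test durations are collapsed into per-module totals, and a simple greedy assignment (longest module first, into the currently lightest group) stands in for the real splitting algorithms. The function name `split_keeping_modules_together` is hypothetical.

```python
import heapq
from collections import defaultdict


def split_keeping_modules_together(durations, n_groups):
    """Assign tests to groups so that each module stays in one group.

    `durations` maps pytest node IDs to seconds. Hypothetical helper, not
    part of pytest-split.
    """
    # Preprocessing: collapse per-test durations into per-module totals.
    module_tests = defaultdict(list)
    module_total = defaultdict(float)
    for node_id, seconds in durations.items():
        module = node_id.split("::", 1)[0]
        module_tests[module].append(node_id)
        module_total[module] += seconds

    # Greedy stand-in for the splitting step: longest module first, always
    # into the group with the smallest running total.
    heap = [(0.0, index) for index in range(n_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    for module in sorted(module_total, key=lambda m: (-module_total[m], m)):
        total, index = heapq.heappop(heap)
        groups[index].extend(module_tests[module])
        heapq.heappush(heap, (total + module_total[module], index))
    return groups


durations = {
    "a.py::test_one": 1.0,
    "a.py::test_two": 2.0,
    "b.py::test_three": 1.0,
    "sub_module/c.py::test_x": 4.0,
    "sub_module/d.py::test_y": 2.0,
}
for index, group in enumerate(split_keeping_modules_together(durations, 2), start=1):
    print(f"group {index}: {group}")
```

Since the module is the unit being assigned, the splitting step never sees individual tests, which is why the same preprocessing could in principle sit in front of either existing algorithm.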