Research Coverage Tools for Testing
There have been instances of functionality breaking and not being caught because no tests exist for the feature. `coverage.py` has been recommended as an option for verifying that a minimum percentage of our code is tested. This is standard practice to ensure that testing hits the majority of the code.

We need to look into the configuration and options available for `coverage.py`. We should also look into adding this to the CI to ensure new functionality is tested.
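For the CI piece, a minimal sketch of what a check could look like (assuming GitHub Actions; the step name and the 75% threshold are placeholders, not a settled choice):

```yaml
- name: Run tests with coverage
  run: |
    pip install coverage
    coverage run -m pytest -c pytest.ini
    # fail the build if total coverage drops below the chosen minimum
    coverage report -m --fail-under=75
```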
Just noting the best way to run it:

```
coverage run -m pytest tests/<test_file>.py && coverage report -m
```
I think this is a great idea! I know `coverage.py` is used by `black`.

From the coverage GitHub:

> It uses the code analysis tools and tracing hooks provided in the Python standard library to determine which lines are executable, and which have been executed
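To illustrate what that means in practice, here is a minimal sketch using coverage.py's programmatic API (the same machinery `coverage run` drives from the command line; the `clip` function is a stand-in, not arkouda code):

```python
import coverage

def clip(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

cov = coverage.Coverage()
cov.start()                      # install the tracing hooks
clip(5, 0, 10)                   # neither early return is taken
cov.stop()
cov.save()
cov.report(show_missing=True)    # the two untaken return lines show up as missing
```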
It looks like all we need to do is `pip install coverage`, and then we can check our test coverage by running:

```
coverage run -m pytest -c pytest.ini
```

And we can use `coverage report` to get a nicer breakdown of coverage per file and which lines are not covered.
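For illustration, the report output looks something like this (the file names and numbers here are made up, not our actual results):

```
Name                       Stmts   Miss  Cover   Missing
--------------------------------------------------------
arkouda/pdarrayclass.py      620     95    85%   102-110, 457
arkouda/sorting.py            88     30    66%   41-59, 73
--------------------------------------------------------
TOTAL                        708    125    82%
```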
https://coverage.readthedocs.io/en/6.4.1/
We might also want to try out codecov, which is used by pandas and pytest for their coverage. I'm wary because it has a pricing page, though it says it's free for open source.
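If we go that route, the upload step would be something along these lines (a sketch; the action version is illustrative and not verified against our CI):

```yaml
- name: Generate XML coverage report
  run: coverage xml          # writes coverage.xml for the uploader to find
- name: Upload to Codecov
  uses: codecov/codecov-action@v3
```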
Also noting that we can `conda install coverage`.
We will most likely want to add `.coverage` to our `.gitignore`. If we use `coverage html`, we need to add `htmlcov/` to our `.gitignore` as well.
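So the relevant `.gitignore` additions would just be:

```
# coverage.py artifacts
.coverage
htmlcov/
```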
We can use `coverage run --source arkouda/ -m pytest -c pytest.ini` to only monitor coverage in the `arkouda/` directory. This will prevent the `tests` directory from being tracked.

When running the report, we can use `coverage html --omit arkouda/_version.py,arkouda/__init__.py`, and this will exclude `_version.py` and `__init__.py` from the resulting HTML report.
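Rather than passing these flags on every invocation, the same settings could live in a `.coveragerc` at the repo root, something like this (a sketch; we haven't committed one):

```ini
[run]
source = arkouda

[report]
omit =
    arkouda/_version.py
    arkouda/__init__.py
```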
Using this shows we have 76% coverage across the files in the `arkouda` directory that contain our code.
@pierce314159, @joshmarshall1, and I discussed this a bit more yesterday. The main issue we are seeing is that line coverage does not actually solve our core problem: ensuring that code which can run on multiple types is tested against all of those types. While it is important to validate that all of our code is being exercised, we really need to go one step further and be able to verify that sections of code running against multiple types are tested against each type.
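For reference, one pattern that helps on the test-writing side is parametrizing tests over every supported dtype, so each type-specific branch actually runs (a sketch; the dtype list and the specific assertion are illustrative, not an existing arkouda test):

```python
import numpy as np
import pytest
import arkouda as ak

# extend this list to every dtype the operation under test supports
@pytest.mark.parametrize("dtype", [np.int64, np.float64, np.bool_])
def test_sum_matches_numpy(dtype):
    nda = np.arange(10).astype(dtype)
    pda = ak.array(nda)           # server-side array with the given dtype
    assert pda.sum() == nda.sum()
```

This doesn't make the coverage report itself type-aware, but it does guarantee that a line covered once is covered for each type we parametrize over.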