xdem

Disable `numba` during CI tests to get adequate coverage stats

Open rhugonnet opened this issue 1 year ago • 3 comments

Ongoing

rhugonnet avatar Apr 20 '23 22:04 rhugonnet

Will this make the tests run much slower?

On a related note, I'll soon add an issue on geoutils/xdem that we should prefer synthetic tests instead of the Longyearbyen examples for many tests. The real-world tests absolutely have value, but much of the functionality could easily be tested with a small made up numpy array pair that would speed up testing by a lot. Again, I'll add issues when I'm able to spend time on implementing it!

erikmannerfelt avatar Apr 25 '23 07:04 erikmannerfelt

I also see in the tests that numba complains about this. We could make a custom jit decorator that can take a local or global disable argument. Something like:


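from typing import Any, Callable, Optional
import numba

DISABLE_NUMBA = False  # assumed global switch, e.g. set to True in CI

def jit(func: Callable[..., Any], disable: bool = DISABLE_NUMBA, numba_kwargs: Optional[dict[str, Any]] = None) -> Callable[..., Any]:
    if disable:
        return func  # leave the plain Python function untouched
    numba_kwargs = numba_kwargs if numba_kwargs is not None else {}
    return numba.njit(func, **numba_kwargs)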

(I haven't tested this code yet)

I think it's functools (`functools.wraps`) that helps preserve the documentation and type information of decorated functions, but this might already be a non-issue since we rarely (if ever?) use JITed functions directly in userspace (they are wrapped by 100% Python functions).
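To make the local/global switch a bit more concrete, here is a minimal sketch of a decorator that works both bare (`@jit`) and with arguments (`@jit(disable=True)`). The environment variable `XDEM_DISABLE_NUMBA` and the fallback when numba is not installed are assumptions for illustration, not existing xdem behaviour. When compilation is skipped the function is returned untouched, so its name and docstring are preserved without extra work.

```python
import os
from typing import Any, Callable, Optional

try:
    import numba
except ImportError:  # assumption: treat a missing numba as "JIT disabled"
    numba = None

# Hypothetical environment variable, not an existing xdem setting.
DISABLE_NUMBA = os.environ.get("XDEM_DISABLE_NUMBA", "0") == "1"


def jit(
    func: Optional[Callable[..., Any]] = None,
    *,
    disable: Optional[bool] = None,
    **numba_kwargs: Any,
) -> Any:
    """Wrap numba.njit so compilation can be switched off globally or per function."""

    def decorate(f: Callable[..., Any]) -> Callable[..., Any]:
        # Use the global switch unless a local override was given.
        skip = DISABLE_NUMBA if disable is None else disable
        if skip or numba is None:
            return f  # plain Python function, visible to coverage tools
        return numba.njit(**numba_kwargs)(f)

    # Bare usage (@jit) passes the function directly; @jit(...) returns the decorator.
    return decorate if func is None else decorate(func)


@jit(disable=True)
def add_one(x: float) -> float:
    """Add one to x."""
    return x + 1.0


@jit
def double(x: float) -> float:
    return 2 * x
```

Numba itself also reads the `NUMBA_DISABLE_JIT` environment variable, so setting `NUMBA_DISABLE_JIT=1` in the CI job may be enough if no per-function control is needed.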

erikmannerfelt avatar Apr 25 '23 07:04 erikmannerfelt

> Will this make the tests run much slower?

Yes, it would. We do need to make the test data smaller!

> On a related note, I'll soon add an issue on geoutils/xdem that we should prefer synthetic tests instead of the Longyearbyen examples for many tests. The real-world tests absolutely have value, but much of the functionality could easily be tested with a small made up numpy array pair that would speed up testing by a lot. Again, I'll add issues when I'm able to spend time on implementing it!

Fully agree, this is a great idea. It is not enough yet, but we did this a bit in recent additions to test_raster: https://github.com/GlacioHack/geoutils/blob/main/tests/test_raster.py#L2848. We also sometimes set a fixed random mask that has at least one nodata value: https://github.com/GlacioHack/geoutils/blob/main/tests/test_raster.py#L423

But I think it should be done consistently everywhere (defined outside of a test class, and fed to every single test like the examples are)! We could have 2 or 3 synthetic DEMs (different projections, nodata, some with masked values, some without), and maybe 2 or 3 real DEMs cropped to a small extent to increase processing speed (probably useful to test things like floating point precision, specific patterns of nodata, etc). :slightly_smiling_face:
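As a rough illustration of the synthetic approach, a small made-up array pair with a known vertical shift and a fixed random nodata mask could look something like the sketch below. The function name `make_synthetic_dem_pair` and its parameters are hypothetical, and it uses plain numpy masked arrays rather than geoutils/xdem objects.

```python
import numpy as np


def make_synthetic_dem_pair(shape=(50, 50), vshift=2.0, nodata_fraction=0.05, seed=42):
    """Return (reference, shifted) masked arrays sharing one nodata mask."""
    rng = np.random.default_rng(seed)  # fixed seed keeps the test data reproducible
    rows, cols = np.indices(shape)
    # A tilted plane plus a little noise stands in for real terrain.
    elevation = 100.0 + 0.5 * rows + 0.2 * cols + rng.normal(0.0, 0.1, size=shape)
    mask = rng.random(shape) < nodata_fraction
    mask[0, 0] = True  # guarantee at least one nodata value
    reference = np.ma.masked_array(elevation, mask=mask)
    shifted = np.ma.masked_array(elevation + vshift, mask=mask)
    return reference, shifted


ref, tba = make_synthetic_dem_pair()
# The median elevation difference over valid pixels should recover the known shift.
estimated_shift = float(np.ma.median(tba - ref))
```

Because the expected answer is known by construction, a test can compare `estimated_shift` to `vshift` with a tolerance instead of relying on values derived from the Longyearbyen data.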

rhugonnet avatar Apr 25 '23 18:04 rhugonnet