geo-deep-learning

[feature request] Use pytest fixtures with scope="session" to access test data during testing

Open remtav opened this issue 2 years ago • 0 comments

Multiple tests in our test suite use test data under tests/data/, referenced by different csvs in tests/tiling/. This test data could be made accessible to all tests via pytest fixtures by creating a conftest.py (reference, see "conftest.py" section).

The fixture in this conftest.py could be parametrized to return the requested data based on a selection between 'binary' or 'multiclass', 'singleband' or 'multiband', etc. To parametrize a fixture, the fixture function can itself return a function rather than the data directly (reference); the pytest documentation calls this pattern "factories as fixtures".

Here's an example of what tests/conftest.py could look like:

from typing import Dict, List, Optional

import pytest
from torchgeo.datasets.utils import extract_archive

from utils.utils import read_csv


@pytest.fixture(scope="session")
def test_data():
    """Return a factory that loads the test-data csv matching the requested options."""
    # Extract the archives once per session rather than on every factory call.
    extract_archive(src="tests/data/massachusetts_buildings_kaggle.zip")
    extract_archive(src="tests/data/new_brunswick_aerial.zip")
    extract_archive(src="tests/data/spacenet.zip")

    def _method(classes_type: str = 'binary', bands_type: str = 'singleband',
                other_suffix: Optional[List[str]] = None) -> Dict:
        # Use None as the default to avoid sharing a mutable list between calls.
        other_suffix = [] if other_suffix is None else other_suffix
        # Reject the call if either argument is not a string ("or", not "and").
        if not isinstance(classes_type, str) or not isinstance(bands_type, str):
            raise ValueError("Classes type and bands type should be strings")
        if not isinstance(other_suffix, list):
            raise ValueError("Additional suffixes should be a list of strings")
        # Build the csv name from the requested options, skipping empty parts.
        bands_type = "-" + bands_type if bands_type else ""
        other_suffix_str = "_" + "_".join(other_suffix) if other_suffix else ""
        csv_path = f"tests/tiling/tiling_segmentation_{classes_type}{bands_type}{other_suffix_str}_ci.csv"
        data = read_csv(csv_path)
        return data
    return _method

This fixture can then be requested by any test under the tests/ directory (including subdirectories), without importing it. For example, in tests/utils/test_foo.py:

def test_foo(test_data):
    data = test_data(classes_type='binary', bands_type='multiband')
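One consequence of the factory approach is that the inner function runs on every call, even though the fixture itself is session-scoped, so the same csv may be read many times. The sketch below is only one possible arrangement: it memoizes the loaded data per csv path. `make_test_data_factory` and the injected `loader` argument are hypothetical names introduced here so the example runs without the repository; in tests/conftest.py the loader would presumably be `read_csv`.

```python
from typing import Any, Callable, Dict, Sequence


def make_test_data_factory(loader: Callable[[str], Any]) -> Callable[..., Any]:
    """Build a factory that loads each requested csv at most once.

    `loader` is a hypothetical stand-in for the project's csv reader.
    """
    cache: Dict[str, Any] = {}

    def _method(classes_type: str = "binary",
                bands_type: str = "singleband",
                other_suffix: Sequence[str] = ()) -> Any:
        # Same naming logic as the conftest.py example in this issue.
        bands = f"-{bands_type}" if bands_type else ""
        suffix = "_" + "_".join(other_suffix) if other_suffix else ""
        csv_path = f"tests/tiling/tiling_segmentation_{classes_type}{bands}{suffix}_ci.csv"
        # Only call the loader the first time a given csv is requested.
        if csv_path not in cache:
            cache[csv_path] = loader(csv_path)
        return cache[csv_path]

    return _method


# In tests/conftest.py this could be exposed as (assumption, not tested here):
#
# @pytest.fixture(scope="session")
# def test_data():
#     return make_test_data_factory(read_csv)
```

Cached objects are shared across tests, so a test that mutates the returned data would affect later tests; returning a copy would be a safer variant if that is a concern.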

Before implementing this:

  • csvs referring to test data should be moved from tests/tiling to tests/data, since they're used for more than just tiling-related tests;
  • csvs should be renamed following a pattern that makes them easy to select, for example: data_segmentation_{bands_type}_{classes_type}_{other_suffix1}_{other_suffix2}_{other_suffixn}.csv
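The naming pattern proposed in the last item could be generated by a small helper, so that the fixture and the files on disk cannot drift apart. This is a minimal sketch under assumptions: `build_csv_name` is a hypothetical name, and the rule that every part must be a non-empty string is a choice made here, not something the issue specifies.

```python
from typing import Sequence


def build_csv_name(bands_type: str,
                   classes_type: str,
                   other_suffixes: Sequence[str] = ()) -> str:
    """Build a file name following the proposed pattern:
    data_segmentation_{bands_type}_{classes_type}_{suffix1}_..._{suffixn}.csv
    """
    parts = ["data_segmentation", bands_type, classes_type, *other_suffixes]
    # Assumption: every part must be a non-empty string.
    if not all(isinstance(part, str) and part for part in parts):
        raise ValueError("All name parts should be non-empty strings")
    return "_".join(parts) + ".csv"


if __name__ == "__main__":
    print(build_csv_name("multiband", "binary"))
    print(build_csv_name("singleband", "multiclass", ["ci"]))
```

Because underscores separate the parts, a suffix that itself contains an underscore would make the name ambiguous to parse back, which may be worth ruling out when choosing suffixes.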

remtav, Dec 07 '22 16:12