
[MNT] Cut down number of parameter combinations for testing

Open lmmentel opened this issue 2 years ago • 3 comments

We are currently running the same tests with many different parameter values that are not adding much in terms of coverage.

Consider this test from the forecasting module:

@pytest.mark.parametrize("y", TEST_YS)
@pytest.mark.parametrize("fh", [*TEST_FHS, *TEST_FHS_TIMEDELTA])
@pytest.mark.parametrize("window_length", TEST_WINDOW_LENGTHS)
@pytest.mark.parametrize("step_length", TEST_STEP_LENGTHS)
@pytest.mark.parametrize("initial_window", TEST_INITIAL_WINDOW)
def test_sliding_window_splitter_with_initial_window(
    y, fh, window_length, step_length, initial_window
):

and the corresponding parameter sets:

TEST_OOS_FHS = [1, np.array([2, 5], dtype="int64")]  # out-of-sample
TEST_INS_FHS = [
    -3,  # single in-sample
    np.array([-2, -5], dtype="int64"),  # multiple in-sample
    0,  # last training point
    np.array([-3, 2], dtype="int64"),  # mixed in-sample and out-of-sample
]
TEST_FHS = [*TEST_OOS_FHS, *TEST_INS_FHS]
TEST_OOS_FHS_TIMEDELTA = [
    [pd.Timedelta(1, unit="D")],
    [pd.Timedelta(2, unit="D"), pd.Timedelta(5, unit="D")],
]  # out-of-sample
TEST_INS_FHS_TIMEDELTA = [
    pd.Timedelta(-3, unit="D"),  # single in-sample
    [pd.Timedelta(-2, unit="D"), pd.Timedelta(-5, unit="D")],  # multiple in-sample
    pd.Timedelta(0, unit="D"),  # last training point
    [
        pd.Timedelta(-3, unit="D"),
        pd.Timedelta(2, unit="D"),
    ],  # mixed in-sample and out-of-sample
]
TEST_FHS_TIMEDELTA = [*TEST_OOS_FHS_TIMEDELTA, *TEST_INS_FHS_TIMEDELTA]
TEST_WINDOW_LENGTHS_INT = [1, 5]
TEST_WINDOW_LENGTHS_TIMEDELTA = [pd.Timedelta(1, unit="D"), pd.Timedelta(5, unit="D")]
TEST_WINDOW_LENGTHS_DATEOFFSET = [pd.offsets.Day(1), pd.offsets.Day(5)]
TEST_WINDOW_LENGTHS = [
    *TEST_WINDOW_LENGTHS_INT,
    *TEST_WINDOW_LENGTHS_TIMEDELTA,
    *TEST_WINDOW_LENGTHS_DATEOFFSET,
]

TEST_INITIAL_WINDOW_INT = [7, 10]
TEST_INITIAL_WINDOW_TIMEDELTA = [pd.Timedelta(7, unit="D"), pd.Timedelta(10, unit="D")]
TEST_INITIAL_WINDOW_DATEOFFSET = [pd.offsets.Day(7), pd.offsets.Day(10)]
TEST_INITIAL_WINDOW = [
    *TEST_INITIAL_WINDOW_INT,
    *TEST_INITIAL_WINDOW_TIMEDELTA,
    *TEST_INITIAL_WINDOW_DATEOFFSET,
]

TEST_STEP_LENGTHS_INT = [1, 5]
TEST_STEP_LENGTHS_TIMEDELTA = [pd.Timedelta(1, unit="D"), pd.Timedelta(5, unit="D")]
TEST_STEP_LENGTHS_DATEOFFSET = [pd.offsets.Day(1), pd.offsets.Day(5)]
TEST_STEP_LENGTHS = [
    *TEST_STEP_LENGTHS_INT,
    *TEST_STEP_LENGTHS_TIMEDELTA,
    *TEST_STEP_LENGTHS_DATEOFFSET,
]

which generates 2592 combinations for running the test. Looking closer, most of the parameter sets contain two values of the same type, one of which is redundant and can be dropped without losing any coverage.
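One way to realise the reduction (a sketch of the idea, not the actual change merged into aeon) is to keep a single representative value per type in each parameter set. The `REDUCED_*` names below are hypothetical; they mirror the `TEST_*` lists above but halve each axis of the cartesian product while still exercising every supported type (int, Timedelta, DateOffset):

```python
import pandas as pd

# Hypothetical reduced parameter sets: one representative value per type
# instead of two, so type coverage is unchanged.
REDUCED_WINDOW_LENGTHS = [
    5,                          # int
    pd.Timedelta(5, unit="D"),  # Timedelta
    pd.offsets.Day(5),          # DateOffset
]
REDUCED_STEP_LENGTHS = [1, pd.Timedelta(1, unit="D"), pd.offsets.Day(1)]
REDUCED_INITIAL_WINDOW = [7, pd.Timedelta(7, unit="D"), pd.offsets.Day(7)]

# Original: 6 window lengths * 6 step lengths * 6 initial windows
# = 216 combinations per (y, fh) pair; reduced: 3 * 3 * 3 = 27,
# an 8x saving on these three axes alone.
original = 6 * 6 * 6
reduced = (
    len(REDUCED_WINDOW_LENGTHS)
    * len(REDUCED_STEP_LENGTHS)
    * len(REDUCED_INITIAL_WINDOW)
)
print(original, reduced)  # 216 27
```

Applying the same one-value-per-type pruning to the `fh` axis (which multiplies in another factor of 12) would shrink the full product further still.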

lmmentel avatar Mar 04 '23 14:03 lmmentel

it's madness

TonyBagnall avatar Mar 04 '23 20:03 TonyBagnall

I think we are in a better place now @lmmentel can we close this?

TonyBagnall avatar Oct 16 '23 07:10 TonyBagnall

Commented in #154 as well, I think we have accomplished this somewhat with the PR_TESTING setup.

MatthewMiddlehurst avatar May 08 '24 23:05 MatthewMiddlehurst

agreed, this is fixed

TonyBagnall avatar Jun 27 '24 20:06 TonyBagnall