Tests fail

Open Gunther-Schulz opened this issue 4 years ago • 6 comments

Hello, in case you are not aware of it: all your tests at https://github.com/sam31415/timeseriescv/blob/master/timeseriescv/tests/test_cross_validation.py are failing.

I was hoping to use the tests as a basis to understand how the library is working. Thx!

Gunther-Schulz, Apr 04 '20 16:04

@Gunther-Schulz Hi. Judging from the author's recent activity, I think this code is no longer maintained. So I forked the repository to make the tests pass and to replace the deprecated features.

r-matsuzaka, Jan 16 '22 11:01

The error I got.

ryo@ryo-PC:~/timeseriescv$ pytest timeseriescv/tests/
================================================================= test session starts =================================================================
platform linux -- Python 3.8.8, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/ryo/timeseriescv
plugins: dash-2.0.0, anyio-3.4.0
collected 9 items

timeseriescv/tests/test_cross_validation.py F.FFF.... [100%]

====================================================================== FAILURES =======================================================================
_________________________________________________________ TestPurgedWalkForwardCV.test_split __________________________________________________________

self = <timeseriescv.tests.test_cross_validation.TestPurgedWalkForwardCV testMethod=test_split>

def test_split(self):
    """
    Apply split to the sample described in the docstring of prepare_time_inhomogeneous_cv_object with n_splits = 5.
    Inspection shows that the pairs test-train sets should respectively be
    1. Train: [0 : 12], test: [13 : 16] (Sample 12 purged from the train set.)
    2. Train: [0 : 16], test: [16, 17]
    3. Train: [0 : 18], test: [18 : 21]
    """
    cv = PurgedWalkForwardCV(n_splits=5)
    prepare_time_inhomogeneous_cv_object(cv)
    count = 0
    for train_set, test_set in cv.split(cv.X, pred_times=cv.pred_times, eval_times=cv.eval_times,
                                        split_by_time=True):
        count += 1
        if count == 1:
            result_train = np.arange(12)
            result_test = np.arange(13, 16)
>           self.assertTrue(np.array_equal(result_train, train_set))

E AssertionError: False is not true

timeseriescv/tests/test_cross_validation.py:117: AssertionError
__________________________________________________________ TestCombPurgedKFoldCV.test_split ___________________________________________________________

self = <timeseriescv.tests.test_cross_validation.TestCombPurgedKFoldCV testMethod=test_split>

def test_split(self):
    """
    Apply split to the sample described in the docstring of prepare_time_inhomogeneous_cv_object, with n_splits = 4
    and n_test_splits = 2. The folds are [0 : 6], [6 : 11], [11 : 16], [16 : 21]. We use an embargo of zero.
    Inspection shows that the pairs test-train sets should respectively be
    [...]
    3. Train: folds 1 and 4, samples [0, 1, 2, 3, 4, 16, 17, 18, 19, 20]. Test: folds 2 and 3, samples [6, 7, 8, 9,
     10, 11, 12, 13, 14, 15]. Sample 5 is purged from the train set.
    4. Train: folds 2 and 3, samples [7, 8, 9, 10, 11, 12, 13, 14, 15]. Test: folds 1 and 4, samples [0, 1, 2, 3, 4,
     5, 16, 17, 18, 19, 20]. Sample 6 is embargoed.
    [...]
    """
    cv = CombPurgedKFoldCV(n_splits=4, n_test_splits=2)
    prepare_time_inhomogeneous_cv_object(cv)
    count = 0
    for train_set, test_set in cv.split(cv.X, pred_times=cv.pred_times, eval_times=cv.eval_times):
        count += 1
        if count == 3:
            result_train = np.array([0, 1, 2, 3, 4, 16, 17, 18, 19, 20])
            result_test = np.array([6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
>           self.assertTrue(np.array_equal(result_train, train_set))

E AssertionError: False is not true

timeseriescv/tests/test_cross_validation.py:153: AssertionError
________________________________________________________ TestComputeFoldBounds.test_by_samples ________________________________________________________

self = <timeseriescv.tests.test_cross_validation.TestComputeFoldBounds testMethod=test_by_samples>

def test_by_samples(self):
    """
    Use a 10 sample set, with 5 folds. The fold left bounds are at 0, 2, 4, 6, and 8.
    """
    cv = PurgedWalkForwardCV(n_splits=5)
    prepare_cv_object(cv, n_samples=10, time_shift='120m', randomlize_times=False)
    result = [0, 2, 4, 6, 8]
>   self.assertEqual(result, compute_fold_bounds(cv, False))

E AssertionError: Lists differ: [0, 2, 4, 6, 8] != [0, 6, 12, 18, 24]
E
E First differing element 1:
E 2
E 6
E
E - [0, 2, 4, 6, 8]
E + [0, 6, 12, 18, 24]

timeseriescv/tests/test_cross_validation.py:185: AssertionError
_________________________________________________________ TestComputeFoldBounds.test_by_time __________________________________________________________

self = <timeseriescv.tests.test_cross_validation.TestComputeFoldBounds testMethod=test_by_time>

def test_by_time(self):
    """
    Create a sample set as described in the docstring of prepare_time_inhomogeneous_cv_object. Inspection shows
    that the fold left bounds are at 0, 7, 13, 16, 18.
    """
    cv = PurgedWalkForwardCV(n_splits=5)
    prepare_time_inhomogeneous_cv_object(cv)
    result = [0, 7, 13, 16, 18]
>   self.assertTrue(all(result[i] == compute_fold_bounds(cv, True)[i] for i in range(5)))

E AssertionError: False is not true

timeseriescv/tests/test_cross_validation.py:195: AssertionError
================================================================== warnings summary ===================================================================
timeseriescv/tests/test_cross_validation.py:3
  /home/ryo/timeseriescv/timeseriescv/tests/test_cross_validation.py:3: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
    import pandas.util.testing as tm

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=============================================================== short test summary info ===============================================================
FAILED timeseriescv/tests/test_cross_validation.py::TestPurgedWalkForwardCV::test_split - AssertionError: False is not true
FAILED timeseriescv/tests/test_cross_validation.py::TestCombPurgedKFoldCV::test_split - AssertionError: False is not true
FAILED timeseriescv/tests/test_cross_validation.py::TestComputeFoldBounds::test_by_samples - AssertionError: Lists differ: [0, 2, 4, 6, 8] != [0, 6,...
FAILED timeseriescv/tests/test_cross_validation.py::TestComputeFoldBounds::test_by_time - AssertionError: False is not true
======================================================= 4 failed, 5 passed, 1 warning in 0.40s ========================================================

r-matsuzaka, Jan 16 '22 11:01

I am not sure this code works...

r-matsuzaka, Jan 16 '22 11:01
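For anyone reading the `test_by_samples` failure above: the expected bounds `[0, 2, 4, 6, 8]` are what you get for 10 samples in 5 folds, while the reported `[0, 6, 12, 18, 24]` is what you would get for 30 samples in 5 folds. That may hint that the helper built a different number of samples than the test asked for, but this is only a guess from the arithmetic. Below is a rough sketch of that arithmetic. `fold_left_bounds` is a made-up name for illustration and is not the library's `compute_fold_bounds`; the rule that the first folds absorb any remainder is also an assumption.

```python
def fold_left_bounds(n_samples, n_splits):
    """Left bound (first index) of each fold when splitting by sample count.

    Hypothetical helper, not the library's compute_fold_bounds. Assumption:
    the first (n_samples % n_splits) folds each get one extra sample.
    """
    base, extra = divmod(n_samples, n_splits)
    bounds, start = [], 0
    for i in range(n_splits):
        bounds.append(start)
        start += base + (1 if i < extra else 0)
    return bounds


print(fold_left_bounds(10, 5))  # [0, 2, 4, 6, 8], the value the test expects
print(fold_left_bounds(30, 5))  # [0, 6, 12, 18, 24], the value the test got
print(fold_left_bounds(21, 4))  # [0, 6, 11, 16], the folds in the CombPurgedKFoldCV docstring
```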

@r-matsuzaka can you provide a link to a fork that passes the tests and provides a correct implementation? I think the version by @sam31415 is not fully correct.

@sam31415 can you clarify why your tests fail? Is it because the test case is wrong or the implementation itself?

ts-00, Feb 15 '22 19:02

Hi! Unfortunately, I am not maintaining this package anymore. I don't know why the tests aren't passing; they used to. But I'm happy to have a pointer to a fork on the main page if one of you feels like maintaining it on a fork.

sam31415, Feb 15 '22 21:02
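One way to narrow down why the `test_split` cases fail: `assertTrue(np.array_equal(...))` only reports "False is not true", so the log never shows which indices differ. A list comparison prints the difference, as the `test_by_samples` failure does. The sketch below uses only the standard library; `explain_mismatch` is a hypothetical helper and the index values are invented for illustration, not taken from the library's output. With NumPy at hand, `np.testing.assert_array_equal` serves a similar purpose.

```python
import unittest


def explain_mismatch(expected, actual):
    """Return None if the two index sequences match, else a readable diff.

    Hypothetical helper: it relies on unittest's list comparison, which
    produces a "Lists differ: ..." message instead of a bare False.
    """
    case = unittest.TestCase()
    try:
        case.assertListEqual(list(expected), list(actual))
    except AssertionError as exc:
        return str(exc)
    return None


# Invented example data: an expected train set and a deliberately longer one.
expected_train = range(12)
actual_train = range(13)
print(explain_mismatch(expected_train, actual_train))
```

Inside the existing tests, the same effect would come from comparing `list(result_train)` with `list(train_set)` via `assertEqual`.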

@ts-00 You should develop it from scratch on your own. The concept is not difficult, and that is the fastest way. That is what I did, so I no longer commit to it.

r-matsuzaka, Feb 16 '22 00:02
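For readers who want to follow that advice, here is a minimal sketch of a purged walk-forward split, written from the general idea rather than from this package's code. Everything in it is an assumption for illustration: the function name, the use of plain numbers for prediction and evaluation times, splitting folds by sample count, and the purge rule that drops a training sample when its evaluation time is not strictly before the first prediction time of the test fold. It does not reproduce the library's `split_by_time` option or an embargo.

```python
def purged_walk_forward_split(pred_times, eval_times, n_splits):
    """Yield (train_indices, test_indices) pairs for a walk-forward scheme.

    Sketch under stated assumptions, not the timeseriescv implementation:
    - samples are already sorted by prediction time;
    - folds are contiguous and of (nearly) equal sample count;
    - each fold after the first is a test set, trained on earlier samples;
    - a training sample is purged if its evaluation time is not strictly
      earlier than the first prediction time of the test fold.
    """
    n_samples = len(pred_times)
    base, extra = divmod(n_samples, n_splits)
    bounds, start = [], 0
    for i in range(n_splits):
        bounds.append(start)
        start += base + (1 if i < extra else 0)
    bounds.append(n_samples)

    for i in range(1, n_splits):
        test_start, test_end = bounds[i], bounds[i + 1]
        first_test_pred = pred_times[test_start]
        train = [j for j in range(test_start) if eval_times[j] < first_test_pred]
        test = list(range(test_start, test_end))
        yield train, test


# Toy data: a prediction at each integer time, evaluated 1.5 time units later.
pred = list(range(10))
evals = [t + 1.5 for t in pred]
for train, test in purged_walk_forward_split(pred, evals, n_splits=5):
    print(train, test)
```

In this toy data the sample just before each test fold is always purged, because its evaluation time falls inside the test period.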