sparse icon indicating copy to clipboard operation
sparse copied to clipboard

Add hypothesis tests

Open sayandip18 opened this issue 3 years ago • 26 comments

Added hypothesis tests. The sparse test suite now has both pytest and hypothesis.

See #163

sayandip18 avatar May 05 '21 21:05 sayandip18

My comment on Gitter was more about moving the existing test suite.

I intended this to be a starting point since this is the first time I'm using Hypothesis. I'll keep pushing more changes.

sayandip18 avatar May 06 '21 07:05 sayandip18

On the next commit, CI should trigger, I just pushed an Azure Pipelines config that should allow it.

hameerabbasi avatar May 11 '21 10:05 hameerabbasi

Changed base to master, since this is generally useful.

hameerabbasi avatar May 11 '21 11:05 hameerabbasi

Do we intend to replace all instances of pytest.mark.parametrize with hypothesis.given?

There are some inherent problems regarding the randomized nature of Hypothesis. Consider this example https://github.com/pydata/sparse/blob/146f1129835547f2efb6ba8a55eb4eca320cbcdd/sparse/tests/test_dok.py#L59-L75

The hypothesis strategy I implemented looks like this

@given(
    st.one_of(
        st.tuples(
            st.tuples(st.integers(min_value=1, max_value=5)),
            st.dictionaries(
                st.integers(min_value=0, max_value=0),
                st.integers(min_value=0, max_value=4),
            ),
        ),
        st.tuples(
            st.tuples(
                st.integers(min_value=1, max_value=5),
                st.integers(min_value=1, max_value=5),
            ),
            st.dictionaries(
                st.tuples(
                    st.integers(min_value=0, max_value=0),
                    st.integers(min_value=0, max_value=0),
                ),
                st.integers(min_value=0, max_value=4),
            ),
        ),
    )
)
def test_construct(sd):
    shape, data = sd
    s = DOK(shape, data)
    x = np.zeros(shape, dtype=s.dtype)

    for c, d in data.items():
        x[c] = d

    assert_eq(x, s)

The strategy is very complex and its randomized nature ensures that I cannot put max_value > 0 in any of the st.dictionaries to avoid error. Is there some simpler way out?

sayandip18 avatar May 12 '21 15:05 sayandip18

Perhaps you could post the error that happens and we can work it out together?

hameerabbasi avatar May 12 '21 16:05 hameerabbasi

But if I understand correctly, you can use basic_indices, perhaps.

hameerabbasi avatar May 12 '21 17:05 hameerabbasi

Combined with composite strategies.

hameerabbasi avatar May 12 '21 17:05 hameerabbasi

The error message is quite large. The brief one is

============================================================== short test summary info ==============================================================
FAILED sparse/tests/test_dok.py::test_construct - IndexError: Index is not smaller than dimension 1 >= 1

I am getting an out of bound index due to the randomness.

sayandip18 avatar May 12 '21 17:05 sayandip18

I would define a two-way composite strategy:

Generate a tuple of shape, data:

  1. Generate a shape
  2. Generate a dictionary of unique indices for the keys and floats for the values.

hameerabbasi avatar May 12 '21 17:05 hameerabbasi

Getting a new issue

I replaced this https://github.com/pydata/sparse/blob/146f1129835547f2efb6ba8a55eb4eca320cbcdd/sparse/tests/test_dok.py#L103-L110 with the following

@given(data())
def test_getitem(data):
    n = st.integers(min_value=1, max_value=5)
    shape = st.tuples(n, n)
    density = st.floats(min_value=0, max_value=1)
    print(type(density))
    indices = st.slices(st.integers(min_value=1, max_value=3))
    s = sparse.random(shape, density, format="dok")
    x = s.todense()

    sparse_sliced = s[indices]
    dense_sliced = x[indices]

    assert_eq(sparse_sliced.todense(), dense_sliced)

which is leading to the following error

    
        if density is None:
            density = 0.01
>       if not (0 <= density <= 1):
E       TypeError: '<=' not supported between instances of 'int' and 'LazyStrategy'

sparse/_utils.py:151: TypeError

FAILED sparse/tests/test_dok.py::test_getitem - TypeError: '<=' not supported between instances of 'int' and 'LazyStrategy'

sayandip18 avatar May 15 '21 17:05 sayandip18

Don't be afraid of having CI red on a WiP PR. Just push up your code and it's easier to pull it down and take a look.

hameerabbasi avatar May 15 '21 18:05 hameerabbasi

Right now I think CI will fail because hypothesis isn't a test dependency. Please add it to https://github.com/pydata/sparse/blob/master/requirements/tests.txt as per https://github.com/pydata/sparse/pull/472#discussion_r626936838.

hameerabbasi avatar May 20 '21 12:05 hameerabbasi

A lot of tests are failing now regardless of hypothesis being there in test.txt. Most are Flaky, meaning they are exceeding the deadline of 200.00ms on the first run.

sayandip18 avatar May 20 '21 12:05 sayandip18

Try this setting in Hypothesis

hameerabbasi avatar May 20 '21 12:05 hameerabbasi

Note: I have added the utility file @H4R5H1T-007

sayandip18 avatar May 22 '21 12:05 sayandip18

Only a handful left now. I am hoping to finish this before the official coding period starts.

sayandip18 avatar Jun 04 '21 22:06 sayandip18

I will squash the commits if the tests pass. The commit message should be something meaningful instead of how it is now.

sayandip18 avatar Jul 06 '21 23:07 sayandip18

@sayandip18 Two things remain:

  1. The tests run forever. Right now it hangs on sparse/tests/test_compressed.py::test_reshape_2.
  2. There are a lot of sparse.random in the tests that should be replaced by gen_sparse_random.

hameerabbasi avatar Jul 06 '21 23:07 hameerabbasi

Can we replace the tmp_path fixture with this?

Replace

def test_abc(tmp_path):
    ...

with

def test_abc():
    with tempfile.TemporaryDirectory() as temp_path:
        ...

hameerabbasi avatar Jul 12 '21 07:07 hameerabbasi

Can we replace the tmp_path fixture with this?

Replace

def test_abc(tmp_path):
    ...

with

def test_abc():
    with tempfile.TemporaryDirectory() as temp_path:
        ...

This still needs doing, if I'm not mistaken.

hameerabbasi avatar Jul 14 '21 05:07 hameerabbasi

I'm getting these two failures for test_io.py

===================================================================== FAILURES =====================================================================
_____________________________________________________________ test_save_load_npz_file ______________________________________________________________

    @given(
>       compression=st.sampled_from([True, False]), format=st.sampled_from(["coo", "gcxs"])
    )
    def test_save_load_npz_file(compression, format):

sparse/tests/test_io.py:13: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

compression = True, format = 'coo'

    @given(
        compression=st.sampled_from([True, False]), format=st.sampled_from(["coo", "gcxs"])
    )
    def test_save_load_npz_file(compression, format):
        with tempfile.TemporaryDirectory() as tmp_path:
            x = sparse.random((2, 3, 4, 5), density=0.25, format=format)
            y = x.todense()
    
>           filename = tmp_path / "mat.npz"
E           TypeError: unsupported operand type(s) for /: 'str' and 'str'

sparse/tests/test_io.py:20: TypeError
-------------------------------------------------------------------- Hypothesis --------------------------------------------------------------------
Falsifying example: test_save_load_npz_file(
    compression=True, format='coo',
)
_________________________________________________________ test_load_wrong_format_exception _________________________________________________________

    def test_load_wrong_format_exception():
        with tempfile.TemporaryDirectory() as tmp_path:
            x = np.array([1, 2, 3])
    
>           filename = tmp_path / "mat.npz"
E           TypeError: unsupported operand type(s) for /: 'str' and 'str'

sparse/tests/test_io.py:31: TypeError

sayandip18 avatar Jul 14 '21 07:07 sayandip18

How about doing with ... as tmp_path_str and then tmp_path = pathlib.Path(tmp_path_str)?

hameerabbasi avatar Jul 14 '21 07:07 hameerabbasi

Passed!

sayandip18 avatar Jul 14 '21 07:07 sayandip18

Codecov Report

Merging #472 (92f8b54) into master (2e3d84d) will decrease coverage by 0.27%. The diff coverage is n/a.

:exclamation: Current head 92f8b54 differs from pull request most recent head 0914c09. Consider uploading reports for the commit 0914c09 to get more accurate results

@@            Coverage Diff             @@
##           master     #472      +/-   ##
==========================================
- Coverage   95.29%   95.02%   -0.28%     
==========================================
  Files          20       20              
  Lines        3019     3015       -4     
==========================================
- Hits         2877     2865      -12     
- Misses        142      150       +8     

codecov[bot] avatar Aug 22 '21 23:08 codecov[bot]

I think you need to merge master.

hameerabbasi avatar Aug 23 '21 05:08 hameerabbasi

@sayandip18 Do you plan on continiuing this?

hameerabbasi avatar Jan 05 '24 06:01 hameerabbasi