clkhash

Numeric overlap tests failing

Open · hardbyte opened this issue 1 year ago • 0 comments

Since recent changes to Hypothesis, the two test_numeric_overlaps tests fail regularly. An example failure:

    @given(thresh_dist=integers(min_value=1),
           resolution=integers(min_value=1, max_value=512),
           candidate=integers())
    def test_numeric_overlaps_with_integers(thresh_dist, resolution, candidate):
        comp = NumericComparison(threshold_distance=thresh_dist, resolution=resolution, fractional_precision=0)
        other = candidate + thresh_dist
        cand_tokens = comp.tokenize(str(candidate))
        other_tokens = comp.tokenize(str(other))
        if other != candidate:
            assert len(set(cand_tokens).intersection(
                set(other_tokens))) == 1, "numbers exactly thresh_dist apart have 1 token in common"
        other = candidate + thresh_dist + int(math.ceil(thresh_dist/2))
        other_tokens = comp.tokenize(str(other))
        assert len(set(cand_tokens).intersection(
            set(other_tokens))) == 0, "numbers more than thresh_dist apart have no tokens in common"
        modulus = int(thresh_dist / (2 * resolution))
        if modulus > 0:
            other = candidate + random.randrange(modulus)
            other_tokens = comp.tokenize(str(other))
            assert len(set(cand_tokens).intersection(
                set(other_tokens))) >= len(
                cand_tokens) - 2, "numbers that are not more than the modulus apart have all or all - 2 tokens in common"
    
        if thresh_dist < 20:
            numbers = [candidate + i for i in range(thresh_dist + 10)]
        else:
            numbers = [candidate + int(thresh_dist * (i * 0.1)) for i in range(20)]
        def overlap(other):
            other_tokens = comp.tokenize(str(other))
            return len(set(cand_tokens).intersection(set(other_tokens)))
        overlaps = [overlap(num) for num in numbers]
        assert overlaps[0] == len(cand_tokens)
>       assert overlaps[-1] == 0
E       assert 1 == 0

tests/test_comparators.py:230: AssertionError
---------------------------------- Hypothesis ----------------------------------
Falsifying example: test_numeric_overlaps_with_integers(
    thresh_dist=19, resolution=1, candidate=-52,
)

You can reproduce this example by temporarily adding @reproduce_failure('6.43.3', b'AAAAJAAAAABp') as a decorator on your test case
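
To make the falsifying example easier to reason about, here is a minimal sketch that rebuilds the test's `numbers` list for `thresh_dist=19, candidate=-52` and shows which value the failing assertion probes. The helper names `farthest_probe` and `overlap` are hypothetical and only for illustration. The commented-out lines assume `NumericComparison` is importable from `clkhash.comparators`, which matches the test file name but is not confirmed by the output above.

```python
def farthest_probe(thresh_dist, candidate):
    """Rebuild the test's `numbers` list and return its last entry."""
    if thresh_dist < 20:
        numbers = [candidate + i for i in range(thresh_dist + 10)]
    else:
        numbers = [candidate + int(thresh_dist * (i * 0.1)) for i in range(20)]
    return numbers[-1]


def overlap(tokenize, a, b):
    """Count tokens shared by a and b.

    `tokenize` is any callable that maps a string to an iterable of tokens,
    for example the bound method `comp.tokenize` from the test.
    """
    return len(set(tokenize(str(a))) & set(tokenize(str(b))))


# Values taken from the falsifying example.
thresh_dist, resolution, candidate = 19, 1, -52

last = farthest_probe(thresh_dist, candidate)
gap = last - candidate
print(last, gap)  # -24 28

# With clkhash installed (import path is an assumption):
# from clkhash.comparators import NumericComparison
# comp = NumericComparison(threshold_distance=thresh_dist,
#                          resolution=resolution, fractional_precision=0)
# print(overlap(comp.tokenize, candidate, last))  # the traceback reports 1
```

So the failing assertion compares -52 with -24, a gap of 28, well beyond `thresh_dist=19`. The earlier check in the same test, at `thresh_dist + ceil(thresh_dist / 2)` (a gap of 29), must have passed for this example, since execution reached the last assertion. That suggests the tokens stop overlapping somewhere later than the test expects, rather than immediately past `thresh_dist`.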

hardbyte · May 17 '23 04:05