thinc icon indicating copy to clipboard operation
thinc copied to clipboard

Extend tests to include newly-added data types

Open rmitsch opened this issue 2 years ago • 8 comments

Description

Extend tests to include data types added via relaxed signatures in https://github.com/explosion/thinc/pull/599.

Types of change

Enhancement / chore.

Checklist

  • [x] I confirm that I have the right to submit this contribution under the project's MIT license.
  • [x] I ran the tests, and all new and existing tests passed.
  • [ ] My changes don't require a change to the documentation, or if they do, I've added all required information.

In response to https://github.com/explosion/thinc/pull/599#discussion_r864553335.

rmitsch avatar Sep 09 '22 08:09 rmitsch

@danieldk Does this kind of modification to the tests reflect what you had in mind? If so, I'll go on adjusting the others that are relevant w.r.t. https://github.com/explosion/thinc/pull/599#discussion_r864553335.

rmitsch avatar Sep 09 '22 09:09 rmitsch

@danieldk Does this kind of modification to the tests reflect what you had in mind? If so, I'll go on adjusting the others that are relevant w.r.t. #599 (comment).

Looks good indeed!

I don't think this is easily possible, but maybe something we want to look into longer term -- ideally, we'd generate examples based on the type signatures. So, if the type signature would change, the generated data would change with it.

Though, I think that's probably a very large undertaking, so going though the types and trying to modify the tests accordingly seems like the best short-term solution.

danieldk avatar Sep 09 '22 11:09 danieldk

I don't think this is easily possible, but maybe something we want to look into longer term -- ideally, we'd generate examples based on the type signatures.

That's interesting - is there already a concept on how to do that? I can imagine having a function deriving input types from another function like

@pytest.mark.parametrize("cpu_ops", CPU_OPS)
@settings(max_examples=MAX_EXAMPLES * 2, deadline=None)
@given(X=strategies.generate_examples_for(ops.flatten))
def test_flatten_unflatten_roundtrip(cpu_ops: NumpyOps, X: numpy.ndarray):
    ...

where strategies.generate_examples_for() uses inspect.signature(ops.flatten) to fetch the signature and then translates Thinc types to numpy types using _Array.dtype(), and finally picks a hypothesis strategy based on the numpy type.

This sketch doesn't feel like that much work though, so I'm probably overlooking something :smile:

rmitsch avatar Sep 09 '22 14:09 rmitsch

That's interesting - is there already a concept on how to do that?

No, I don't think we have anything like that yet.

This sketch doesn't feel like a that much work though, so I'm probably overlooking something 😄

Try it 👍. As long as it's opt-in per test, we could enable them gradually. For reviewing I think it's the nicest to do this as a proof of concept for maybe two or three methods first. Avoids a lot of rewrites as a result of review comments. Then once we are happy with the design, we could apply it to more methods.

danieldk avatar Sep 13 '22 07:09 danieldk

Looked into this a bit. From Python 3.7 upwards this kind of type resolution would definitely work, 3.6 and below is a bit of a pain. To resolve function return type annotations into proper types, something like this would be necessary in 3.6 (fn is a Callable[..., Any] for a Callable returning e.g. an Union:

import inspect
from typing import _eval_type
# Extract all types in Union. This only works if those types are imported already.
types = [_eval_type(arg, globals(), globals()) for arg in inspect.signature(fn).return_annotation.__args__]

Whereas in Python 3.7+ it'd be

from typing import get_type_hints, get_args  # type_extensions in 3.7, typing in 3.8+
types = get_args(get_type_hints(fn)["return"]])

So in 3.6 we'd have to use private method and resolve the (_)ForwardReference values in inspect.signature(fn).return_annotation.__args__ to actual types, and we'd have to import all types before. Assuming there isn't any other, more elegant way that eluded me. 3.7+ returns them already in a usable format and requires only the public API.

Considering this I'd recommend implementing the type-dynamic testing once thinc drops support for 3.6. What do you think? Meaning I'd adjust the tests as suggested in the current draft and add a follow-up ticket.

rmitsch avatar Sep 15 '22 13:09 rmitsch

Considering this I'd recommend implementing the type-dynamic testing once thinc drops support for 3.6. What do you think?

Sounds completely sensible to me, we should avoid relying on internals of external dependencies.

danieldk avatar Sep 30 '22 07:09 danieldk

An alternative to waiting for 3.6 support to be dropped before implementing the tests would be to decorate them with

@pytest.mark.skipif(sys.version_info < (3, 7), reason="test tooling requires python3.7 or higher")

richardpaulhudson avatar Oct 25 '22 07:10 richardpaulhudson

Deploy request for thinc-ai pending review.

Visit the deploys page to approve it

Name Link
Latest commit 62d1a8317e861e1fd57c1c01d79dd9f67a01efc5

netlify[bot] avatar Jun 27 '23 15:06 netlify[bot]