Error message could include more shape information

Open awf opened this issue 1 year ago • 5 comments

The following code passes type checking and runs without error:

import jax
from typeguard import typechecked as typechecker
from jaxtyping import f32, u, jaxtyped

@jaxtyped
@typechecker
def standardize(x : f32["N"], eps=1e-5):
    return (x - x.mean()) / (x.std() + eps)

rng = jax.random.PRNGKey(42)

embeddings = jax.random.uniform(rng, (11,))
t1 = standardize(embeddings)

The following code correctly fails type checking, but the message would ideally tell us why the shapes don't match:

embeddings = jax.random.uniform(rng, (11,13))
t1 = standardize(embeddings)
# TypeError: type of argument "x" must be jaxtyping.array_types.f32['N']; got jaxlib.xla_extension.DeviceArray instead

Ideally, this would read something like:

# TypeError: type of argument "x" must be jaxtyping.array_types.f32['N']; got jaxlib.xla_extension.DeviceArray(dtype=float32,shape=(11,13)) instead

awf avatar Jul 19 '22 16:07 awf

Absolutely! I completely agree.

So at the moment this is a limitation of the current approach. The checking is performed via isinstance, which simply returns True or False, and it's then up to either typeguard or beartype to take this and turn it into an error message. This means that there isn't really any way of returning this additional information about why the isinstance check failed.

I don't have a great solution in mind for this at the moment. I'd welcome any thoughts on how to accomplish this.

patrick-kidger avatar Jul 19 '22 17:07 patrick-kidger
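
The limitation described above can be reproduced without JAX. The following is a minimal sketch, not jaxtyping's actual implementation: `FakeArray`, `_VectorMeta`, `Float32Vector` and `check_argument` are all hypothetical names invented for illustration. It shows that `__instancecheck__` knows which test failed, yet only a bool reaches the code that builds the error message.

```python
class FakeArray:
    """Hypothetical stand-in for a JAX array: it only carries a shape and a dtype."""

    def __init__(self, shape, dtype="float32"):
        self.shape = shape
        self.dtype = dtype


class _VectorMeta(type):
    def __instancecheck__(cls, obj):
        # The metaclass knows exactly which test failed here...
        if getattr(obj, "dtype", None) != cls.dtype:
            return False
        if len(getattr(obj, "shape", ())) != 1:
            return False
        # ...but a bool is all that isinstance() hands back to its caller.
        return True


class Float32Vector(metaclass=_VectorMeta):
    dtype = "float32"


def check_argument(argname, value, expected_type):
    # Loosely modelled on an isinstance-based type checker: the message can
    # only be built from the two types, not from the reason for the failure.
    if not isinstance(value, expected_type):
        raise TypeError(
            f'type of argument "{argname}" must be {expected_type.__name__}; '
            f"got {type(value).__name__} instead"
        )
```

Calling `check_argument("x", FakeArray((11, 13)), Float32Vector)` raises a `TypeError` that names `FakeArray` but says nothing about the offending shape, which is the behaviour reported in this issue.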

I see: you're doing all your work at https://github.com/google/jaxtyping/blob/35201eb189cc004276925f96e0aa6bfc469e46be/jaxtyping/array_types.py#L102, and then typeguard says

            elif not isinstance(value, expected_type):
                raise TypeError(
                    'type of {} must be {}; got {} instead'.
                    format(argname, qualified_name(expected_type), qualified_name(value)))

Hmmm.

So it turns out this isn't too noisy, as when your check fails, we are almost certainly going to error:

class _MetaAbstractArray(type):
    def __instancecheck__(cls, obj):
        if not isinstance(obj, jnp.ndarray):
            print(f'jaxtyping: {obj}:{type(obj)} is not a jnp.ndarray.')
            return False

        if cls.dtypes is not _any_dtype and obj.dtype not in cls.dtypes:
            print(f'jaxtyping: {obj} dtype ({obj.dtype}) is not in {cls.dtypes}.')
            return False

awf avatar Jul 19 '22 17:07 awf

Yeah, adding our own manual print statements might be one approach. Not super elegant of course so if we did this I'd probably add a global toggle on whether to print them out.

patrick-kidger avatar Jul 19 '22 18:07 patrick-kidger

Exactly so. It might even be a case for, ugh, an environment variable, so a usage pattern might be

% python t.py
...
Error message.
% JAXTYPING=verbose python t.py

awf avatar Jul 19 '22 18:07 awf

probably verbose should be the default? probably >90% of exceptions for a library like this one will be thrown while the dev is looking, not in some production use case where the print statement would be an issue.

that said, it probably should still print to stderr not stdout

GallagherCommaJack avatar Aug 24 '22 03:08 GallagherCommaJack
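
The toggle discussed in the last few comments might be sketched as follows. This is an illustration under stated assumptions: `JAXTYPING=verbose` is the environment variable floated in this thread and is not an existing jaxtyping setting, and `FakeArray`, `_explain`, `_VectorMeta` and `Vector` are hypothetical names. The diagnostic goes to stderr, as suggested above.

```python
import os
import sys


def _explain(message):
    # Hypothetical toggle: only print when JAXTYPING=verbose is set.
    # The variable is read on every call so it can be changed at runtime.
    if os.environ.get("JAXTYPING") == "verbose":
        print(f"jaxtyping: {message}", file=sys.stderr)


class FakeArray:
    """Hypothetical stand-in for a JAX array: it only carries a shape."""

    def __init__(self, shape):
        self.shape = shape


class _VectorMeta(type):
    def __instancecheck__(cls, obj):
        shape = getattr(obj, "shape", None)
        if shape is None:
            _explain(f"{type(obj).__name__} has no shape attribute.")
            return False
        if len(shape) != 1:
            _explain(f"shape {tuple(shape)} does not match a 1-D annotation.")
            return False
        return True


class Vector(metaclass=_VectorMeta):
    pass
```

With the variable set, `isinstance(FakeArray((11, 13)), Vector)` still returns `False`, but a line naming the shape `(11, 13)` is written to stderr before the type checker raises its own error. Making verbose the default, as proposed above, would only mean inverting the condition in `_explain`.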

Hi @patrick-kidger - any updates on this? Feels like this makes jaxtyping a bit frustrating to use with a typechecker since shape mismatches are so common

dkamm avatar Mar 10 '23 05:03 dkamm

As it turns out, an analogous point has just been raised over on the beartype repo: https://github.com/beartype/beartype/issues/216

If beartype includes a hook for this use case, then it's possible that we could add in some nicer error messages here.

Until then, my usual recommendation is to arrange to open a debugger when things crash (e.g. pytest --pdb if using this at test time), and then just walk the stack trace looking at the object that was passed.

patrick-kidger avatar Mar 10 '23 05:03 patrick-kidger
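
The "walk the stack trace" step can also be done programmatically rather than in an interactive debugger. The sketch below is one possible way to do it using only the standard traceback objects. `array_shapes_in_traceback`, `FakeArray` and the toy `standardize` are hypothetical names, and the check inside `standardize` merely stands in for a real type checker.

```python
def array_shapes_in_traceback(exc):
    """Return (function, variable, shape) for every local with a tuple `.shape`."""
    found = []
    tb = exc.__traceback__
    while tb is not None:
        frame = tb.tb_frame
        for name, value in list(frame.f_locals.items()):
            shape = getattr(value, "shape", None)
            # Only report real shapes; skip e.g. modules exposing a `shape` function.
            if isinstance(shape, tuple):
                found.append((frame.f_code.co_name, name, shape))
        tb = tb.tb_next
    return found


class FakeArray:
    """Hypothetical stand-in for a JAX array: it only carries a shape."""

    def __init__(self, shape):
        self.shape = shape


def standardize(x):
    # Stand-in for a failing runtime type check on a 1-D annotation.
    if len(x.shape) != 1:
        raise TypeError("type of argument \"x\" must be f32['N']; got FakeArray instead")
    return x


try:
    standardize(FakeArray((11, 13)))
except TypeError as err:
    report = array_shapes_in_traceback(err)
```

Here `report` contains an entry for the local `x` in `standardize` together with its shape, which is the same information one would read off by hand after `pytest --pdb` drops into the failing frame.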

@patrick-kidger thanks for the swift response! Crazy how that timing worked out.

Just out of curiosity, do you think patching typeguard like in torchtyping could work as a temporary solution? Not requesting to add it here but figured I'd ask since it looks complicated

dkamm avatar Mar 10 '23 06:03 dkamm

In principle, anything is possible with monkey patching :)

In practice that was a crazy solution that I'm not keen to repeat!

patrick-kidger avatar Mar 10 '23 06:03 patrick-kidger

@patrick-kidger it looks like typeguard 4 is adding support for a typecheck fail callback (see for example https://github.com/agronholm/typeguard/blob/master/src/typeguard/_functions.py#L116-L144). Maybe jaxtyping could make use of this when it's released?

dkamm avatar Apr 24 '23 08:04 dkamm

Nice! Beartype also has similar plans: https://github.com/beartype/beartype/issues/235

I'd be happy to add support for either/both when they're added. In fact, maybe it's worth asking if they could standardise on an API.

patrick-kidger avatar Apr 24 '23 15:04 patrick-kidger

Coming back to this: it looks like it might take quite some time for beartype and typeguard to standardize and implement their APIs, so I think it would be nice to implement this now, even if guarded by a global flag. I am guessing that a better solution than the one with printing would be to decorate functions with another decorator that catches exceptions related to jaxtyping and re-raises them with better messages, while still preserving the original error message. Something like this:

@jaxtyping.pretty_errors
@beartype.beartype
@jaxtyping.jaxtyped
def f():
    ...

I imagine reraising could look similar to the jax errors, so that we have a "pretty" error printed after the original trace from the typechecker. Similar to this:

BeartypeTypeHintViolation: blah blah blah / TypeError: blah blah blah

The above exception was the direct cause of the following exception:

In the function 'f' argument 'x':
expected:      Array["N",     dtype=float]
got:           Array["1,2",   dtype=float]

argument 'y':
expected: Array["", dtype=int]
got:      Array["", dtype=float]

The problem is that we will have to branch on whether the error was raised by typeguard, by beartype, or by anything else, transform the culprit log into a unified format, and only then do the pretty printing.

Once the official API is implemented, we will need pretty-printing functionality anyway, so implementing it beforehand does not look like wasted work. And even though it is an ugly (and unstable) solution, I am guessing that most jaxtyping users would greatly appreciate having this functionality at hand. For example, in my case, runtime type checking is mostly useful during prototyping/debugging, and this would save me quite a bit of time, since I would only need to take a quick look at the trace instead of inserting jax.debug.print("{x}", x=x) at the place where 'f' is called from.

knyazer avatar Oct 01 '23 14:10 knyazer
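
A rough sketch of the re-raising decorator proposed above is given below. It is not part of jaxtyping: `pretty_errors` is the hypothetical name from the comment, while `describe`, `FakeArray` and `toy_typechecker` are invented here so that the example runs without JAX, typeguard or beartype. For simplicity it catches plain `TypeError`; recognising checker-specific exceptions, as the comment points out, would need extra branching.

```python
import functools
import inspect


def describe(value):
    """Summarise array-like values as Type(dtype=..., shape=...)."""
    shape = getattr(value, "shape", None)
    if not isinstance(shape, tuple):
        return type(value).__name__
    dtype = getattr(value, "dtype", None)
    return f"{type(value).__name__}(dtype={dtype}, shape={shape})"


def pretty_errors(fn):
    """Hypothetical decorator: re-raise type-check failures with argument details."""
    signature = inspect.signature(fn)  # follows __wrapped__ set by functools.wraps

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except TypeError as err:
            try:
                bound = signature.bind(*args, **kwargs)
            except TypeError:
                bound = None
            if bound is None:
                raise  # wrong call signature: nothing useful to add
            lines = [f"In the function {fn.__name__!r}:"]
            for name, value in bound.arguments.items():
                hint = signature.parameters[name].annotation
                if hint is not inspect.Parameter.empty:
                    lines.append(
                        f"  argument {name!r}: expected {hint}, got {describe(value)}"
                    )
            # `from err` keeps the checker's original message in the traceback.
            raise TypeError("\n".join(lines)) from err

    return wrapper


class FakeArray:
    """Hypothetical stand-in for a JAX array: it only carries a shape and a dtype."""

    def __init__(self, shape, dtype="float32"):
        self.shape = shape
        self.dtype = dtype


def toy_typechecker(fn):
    """Stand-in for typeguard/beartype: rejects a non-1-D `x` with a terse message."""

    @functools.wraps(fn)
    def wrapper(x, *args, **kwargs):
        if len(x.shape) != 1:
            raise TypeError(
                "type of argument \"x\" must be f32['N']; got FakeArray instead"
            )
        return fn(x, *args, **kwargs)

    return wrapper


@pretty_errors
@toy_typechecker
def standardize(x: "f32['N']", eps=1e-5):
    return x
```

Calling `standardize(FakeArray((11, 13)))` raises a `TypeError` whose message lists the expected annotation next to the received dtype and shape, with the checker's terse error attached as `__cause__`. One caveat of this simplification is that a `TypeError` raised inside the function body for unrelated reasons would be wrapped as well.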

FWIW we ended up implementing a small wrapper that does that for typeguard. It is literally ~30 lines of code (of which only 2 lines are typeguard-specific, 15 lines do pretty printing, and the rest is just boilerplate and comments), so instead of using

@jt.jaxtyped

we just use

@util.jaxtyped

marksandler2 avatar Oct 01 '23 17:10 marksandler2