Typing support

Open sobolevn opened this issue 4 years ago • 13 comments

I love glom! It is easy and powerful.

The only feature I am really missing is typing support. Here are some examples that really bother me:

from dataclasses import dataclass

@dataclass
class FullNameSpec(object):
    first: str
    second: str

@dataclass
class Person(object):
    age: int
    name: FullNameSpec

And here's the usage:

glom(some_person, 'age')  # should have `int` type
glom(some_person, 'name.first')  # should have `str` type
glom(some_person, 'boom')  # should raise mypy error

There are probably other use cases, like:

  • Assign, when we try to assign an incorrect type to a typed field
  • Mutate, with a similar idea
  • Some really advanced T magic

What do you think about it? Is this something you would like to support?

sobolevn avatar Jun 25 '20 10:06 sobolevn

That would be great -- do you have any idea how to go about implementing this?

I guess we'd need:

1- an integration point with the type checking library (https://mypy.readthedocs.io/en/stable/extending_mypy.html#current-list-of-plugin-hooks get_function_hook() maybe?)

2- a reflection API that would let us say "first argument is of type Person, which has field name of type FullNameSpec, which has field first of type str"

Essentially we'd need to do the same operation as glom does at runtime, but instead over the type structure.
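
A minimal sketch of what "the same operation, but over the type structure" could look like, using only the standard library rather than mypy. It assumes plain dotted string paths and annotated classes like the dataclasses above; `resolve_path_type` is a hypothetical helper name, not part of glom or mypy.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class FullNameSpec:
    first: str
    second: str


@dataclass
class Person:
    age: int
    name: FullNameSpec


def resolve_path_type(root_type: type, path: str) -> type:
    """Walk a dotted path over class annotations and return the final type.

    Hypothetical helper: mirrors attribute traversal, but on types, not values.
    """
    current = root_type
    for part in path.split("."):
        hints = get_type_hints(current)  # annotations declared on the class
        if part not in hints:
            # the analogue of the mypy error wanted for glom(some_person, 'boom')
            raise AttributeError(
                f"{current.__name__!r} has no annotated field {part!r}"
            )
        current = hints[part]
    return current
```

With this, `resolve_path_type(Person, 'name.first')` gives `str`, and an unknown field such as `'boom'` raises `AttributeError`. A real plugin would do the equivalent lookup against mypy's own type objects.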

kurtbrose avatar Jun 25 '20 11:06 kurtbrose

Yes, you are right. I have some experience with mypy plugins; it should not be too hard.

Basically there's a special API to get things from SymbolTableNode: https://github.com/python/mypy/blob/95eff27adb1fae7b1ec4a7315524ebfdcebf6e2b/mypy/nodes.py#L2883

For example, for a class FullNameSpec it would have something like this:

        assert isinstance(object_type, Instance)

        sym = object_type.type.names[str_name]
        return sym.node.type  # would return anything under `str_name`

For 'age' it will return builtins.int type, etc.

sobolevn avatar Jun 25 '20 11:06 sobolevn

If we had a wrapper around the type API which could emulate the real API but return types instead of values that would be a step in the right direction -- I don't think having a "shadow" version of the whole library just for type checking would be maintainable, so we'd need to make a "test" object that could float through the system

Like....

class TypeWrap:
    def __init__(self, object_type):
        self.object_type = object_type

    def __getattr__(self, attr):
        # look up the declared type of `attr` on the wrapped mypy type
        return TypeWrap(self.object_type.type.names[attr].node.type)

def get_function_hook(...):
    return glom(TypeWrap(input), spec).object_type

that seems way over simplified but possibly?

it presumes that we have the concrete spec though -- would the mypy plugin be able to see 'name.first' in your example, or would it just see "second argument = str"?

kurtbrose avatar Jun 25 '20 11:06 kurtbrose

huh, actually with https://github.com/mahmoud/glom/pull/94 about to land -- this could have some very interesting interactions with typing

glom(val, Match(Or({str: int}, [int])))

That would assert that val is either a dict with str keys and int vals, or a list with int vals.

Meaning, even if val's type is unknown going in, leaving that call it will either be one of those types or an exception will be thrown.

kurtbrose avatar Jun 25 '20 13:06 kurtbrose

@kurtbrose sadly, that's not how it works. Here's quite a relevant example, a mypy plugin for Django models: https://github.com/typeddjango/django-stubs/blob/master/mypy_django_plugin/django/context.py#L127

sobolevn avatar Jun 25 '20 13:06 sobolevn

oof 50% of that method (6 of 13 lines) is calls to other methods and helper functions

hard to follow what's going on there

are you saying that this function couldn't be implemented?

def get_attr_type(object_type, attr):
    """return the typing signature of object_type.attr"""
    # ...

kurtbrose avatar Jun 25 '20 14:06 kurtbrose

I will probably drop a simple prototype, it would be easier to discuss the existing code 🙂

sobolevn avatar Jun 25 '20 16:06 sobolevn

Great! Looking forward to the prototype. I've always suspected putting a return type on a glom call would be quite challenging, but I'm very open to pleasant surprises :)

mahmoud avatar Jun 25 '20 17:06 mahmoud

fantastic! I think this could be the start of a great collaboration :-)

kurtbrose avatar Jun 25 '20 17:06 kurtbrose

Super-early prototype is done.

sobolevn avatar Jun 26 '20 18:06 sobolevn

okay so I guess this is the key part:


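from typing import Callable, Optional

from mypy.plugin import FunctionContext, Plugin
from mypy.types import Type as MypyType


class _GlomPlugin(Plugin):
    def get_function_hook(
        self, fullname: str,
    ) -> Optional[Callable[[FunctionContext], MypyType]]:
        if fullname == 'glom.core.glom':
            def test(ctx: FunctionContext) -> MypyType:
                # debug output: the call context and the inferred argument types
                print(ctx)
                print(ctx.arg_types[0][0], ctx.arg_types[1][0])
                # the literal spec string, e.g. 'age'
                print(ctx.arg_types[1][0].last_known_value.value)
                # resolve the spec as a member access on the target's type
                return ctx.api.expr_checker.analyze_external_member_access(
                    ctx.arg_types[1][0].last_known_value.value,
                    ctx.arg_types[0][0],
                    ctx.context,
                )
            return test
        return None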

Glom itself would definitely need to be imported by the type-checker, and it would need access to the actual value of the spec.

Given the complexity of the API, this might need to wait until glom has its own visit/compile API in place. (That is, glom may need to derive its own type map internally and then translate that, rather than running the whole thing in terms of mypy abstractions.)

kurtbrose avatar Jun 26 '20 20:06 kurtbrose

Just saw the PR, thanks, @sobolevn! Really clarifies some things.

@kurtbrose Yeah I can see that. Sounds like another vote for glompile()! (to be clear, @sobolevn, Kurt and I have sketched this compile step, we just haven't actually pushed the button on writing it yet).

mahmoud avatar Jun 26 '20 20:06 mahmoud

once matching merges, scanning for Match specs and translating them to typing is probably the low-hanging fruit

glom(val, ( ... , Match([int])) )

completely disregarding val and the rest of the spec, Match([int]) at the end tells us this spec is guaranteed to return a list of integers
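
A rough sketch of that translation, assuming only the simplest pattern shapes (a bare type, a one-element list, a one-entry dict). Everything here is hypothetical: `MatchStandIn` stands in for glom's Match so the example runs without glom, and `pattern_to_type`, `either`, and `chain_return_type` are illustrative names, not glom or mypy APIs.

```python
from typing import Any, Dict, List, Union


class MatchStandIn:
    """Hypothetical stand-in for glom's Match; it only stores the pattern."""

    def __init__(self, pattern: Any) -> None:
        self.pattern = pattern


def pattern_to_type(pattern: Any) -> Any:
    """Translate a Match-style pattern into a typing annotation."""
    if isinstance(pattern, type):
        return pattern  # e.g. int -> int
    if isinstance(pattern, list) and len(pattern) == 1:
        return List[pattern_to_type(pattern[0])]  # [int] -> List[int]
    if isinstance(pattern, dict) and len(pattern) == 1:
        (key, value), = pattern.items()
        return Dict[pattern_to_type(key), pattern_to_type(value)]
    return Any  # anything this sketch does not understand


def either(*patterns: Any) -> Any:
    """Stand-in for Or(...): the union of each alternative's type."""
    return Union[tuple(pattern_to_type(p) for p in patterns)]


def chain_return_type(spec: Any) -> Any:
    """Return type of a spec, looking only at its final step."""
    last = spec[-1] if isinstance(spec, tuple) and spec else spec
    if isinstance(last, MatchStandIn):
        return pattern_to_type(last.pattern)
    return Any
```

Here `chain_return_type(('a.b', MatchStandIn([int])))` yields `List[int]` without inspecting the earlier steps, and `either({str: int}, [int])` yields `Union[Dict[str, int], List[int]]`, matching the Or example earlier in the thread.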

kurtbrose avatar Jun 28 '20 20:06 kurtbrose