
TypedDict Nested Dictionaries?

Open riley-martine opened this issue 6 years ago • 8 comments

Is there a way to use TypedDict for more complicated nested structures? For example, if I pass around a dictionary with certain defined keys and values, some of which are dictionaries themselves, can I validate those?

  • Are you reporting a bug, or opening a feature request? Asking a question/maybe feature request

  • Please insert below the code you are checking with mypy, or a mock-up repro if the source is private. We would appreciate if you try to simplify your case to a minimal repro.

from mypy_extensions import TypedDict

Data = TypedDict("Data", {"i": {"j": int}})

def access(data: Data) -> None:
    data["i"]["j"] += 1

mydata = {"i": {"j": 5}}  # type: Data
access(mydata)
  • What is the actual behavior/output?
typeddict_practice.py:3: error: Invalid field type
typeddict_practice.py:7: error: TypedDict "TypedDict" has no key 'i'
typeddict_practice.py:10: error: Extra key 'i' for TypedDict "TypedDict"
  • What is the behavior/output you expect? No output

  • What are the versions of mypy and Python you are using? mypy 0.600 Python 3.6.5

  • What are the mypy flags you are using? (For example --strict-optional) None

I can use something like Data = TypedDict("Data", {"i": Dict[str, int]}), but that doesn't validate keys past the first level.

I can also do

Intermediate = TypedDict("Intermediate", {"j": int})
Data = TypedDict("Data", {"i": Intermediate})

but this is cumbersome, especially when there are a ton of fields. As #4299 notes, you can't nest TypedDicts either. Is there a better way?

riley-martine avatar Jun 05 '18 15:06 riley-martine

But why don't you like sequential definitions? As PEP 20 says flat is better than nested. Also it would make sense to give meaningful names to sub-dictionaries (after all something gave you the idea to use a nested dict instead of a flat one), you can potentially re-use some of them. For example:

from typing import List, TypedDict

class UserData(TypedDict):
    name: str
    id: int

class Request(TypedDict):
    request: bytes
    user: UserData

class Response(TypedDict):
    header: bytes
    load: List[bytes]
    user: UserData

I could imagine sometimes you may want a one-off TypedDict, but IMO such situations are rare so this is a low priority.

ilevkivskyi avatar Jun 06 '18 13:06 ilevkivskyi

But why don't you like sequential definitions?

I started implementing with sequential definitions, and it works better than I expected. I was able to reuse some of the inner ones, and it was quite convenient. Thank you for your suggestion!

As PEP 20 says flat is better than nested.

The situation I'm dealing with is server config data returned from an API. It can be highly nested, and takes a lot of named sub-TypedDicts to work. I like the peace of mind of knowing that any field accessed is guaranteed to be there and be the right type, that I haven't skipped or mistyped an index, and that if I create/modify the same type of structure in different places in my code, it is guaranteed to have all of the fields it should.

For example, there is one place in the code where if the server can't be reached, a stub config is returned. Adding the TypedDict code let me know this stub was missing a few fields that could have been accessed later on. Furthermore, adding Optional[]s allowed me to fix places where there wasn't code to handle the absence of a field. (e.g. calling outerdict["key"]["data"].keys() when outerdict["key"]["data"] was type Optional[Dict[str, float]])
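A minimal sketch of that Optional case, in case it helps anyone else. The names OuterDict, KeyData and data_keys are made up for illustration and are not from my real code; the point is only that the None check is what lets the checker accept the .keys() call:

```python
from typing import Dict, List, Optional, TypedDict


class KeyData(TypedDict):
    # The API may return null here, hence Optional.
    data: Optional[Dict[str, float]]


class OuterDict(TypedDict):
    key: KeyData


def data_keys(outerdict: OuterDict) -> List[str]:
    data = outerdict["key"]["data"]
    # Without this guard a checker reports that None has no attribute "keys".
    if data is None:
        return []
    return sorted(data.keys())
```

Calling data_keys on a stub config with "data": None then returns an empty list instead of raising AttributeError.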

such situations are rare

I'm not sure how rare this is, so I will defer to your judgement. Thank you for considering this idea!

riley-martine avatar Jun 06 '18 15:06 riley-martine

I have a question about nested TypedDict too.

from typing import TypedDict

class Foo(TypedDict):
    foo: str

class Bar(TypedDict):
    bar: Foo
    
data: Bar = {
    'bar': {'foo': '1'}
}

print({k: v.get('foo') for k, v in data.items()})

This code works and prints {'bar': '1'} to stdout. But mypy shows an error: error: "object" has no attribute "get" https://mypy-play.net/?mypy=latest&python=3.8&gist=b74af38f157707d1b5e5558d3ecea9bc Why does it define the type of a value, returned by .items(), as object? How could I fix this error?

ABCDeath avatar Jul 23 '20 12:07 ABCDeath

@ABCDeath You could try overriding .items? It's painful, but it's the only thought I had.
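Another option that might be less painful: as far as I understand it, a TypedDict is only treated as a Mapping[str, object] when you iterate over it, because each key can have a different value type, so .items() gives you values typed as object. Indexing with a literal key keeps the declared type. A sketch reusing the Foo/Bar classes from the question (the result variable is just for illustration):

```python
from typing import TypedDict


class Foo(TypedDict):
    foo: str


class Bar(TypedDict):
    bar: Foo


data: Bar = {"bar": {"foo": "1"}}

# data["bar"] is known to be a Foo, so .get("foo") type-checks here,
# whereas the values coming out of data.items() are plain `object`.
result = {"bar": data["bar"].get("foo")}
print(result)
```

This gives up the generic loop over keys, which is the trade-off: the checker can only track per-key types when the key is a literal.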

KeynesYouDigIt avatar Oct 27 '20 19:10 KeynesYouDigIt

To add to @riley-martine's example of configs, I often find myself creating message-passing systems (e.g. celery, kafka, etc.).

All of these need well-defined message schemas that will be passed between producers and consumers. The messages are highly isolated from one another. This could probably help with generating Avro schema type checkers.

In those cases, TypeScript-like definitions make life somewhat easier for static checkers (and for developers).

e.g.

export interface EmailFetcherEvent {
    userEmail: string,
    pageToken?: string | null,
    updateHistoryId?: boolean | false,
    parameters: Array<{
       value: number,
        name: string,
        active: boolean
    }>
}
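For comparison, here is roughly what the same schema needs in Python today with sequential definitions. This is only a sketch: the helper names Parameter and _EmailFetcherEventRequired are mine, and I'm using a total=False subclass to express the optional keys so it runs on older Python versions.

```python
from typing import List, Optional, TypedDict


class Parameter(TypedDict):
    value: float
    name: str
    active: bool


class _EmailFetcherEventRequired(TypedDict):
    # Keys that must always be present.
    userEmail: str
    parameters: List[Parameter]


class EmailFetcherEvent(_EmailFetcherEventRequired, total=False):
    # Keys that may be omitted (the `?` fields in the TypeScript version).
    pageToken: Optional[str]
    updateHistoryId: bool


event: EmailFetcherEvent = {
    "userEmail": "user@example.com",
    "parameters": [{"value": 1.0, "name": "depth", "active": True}],
}
```

That is three named classes for what is a single inline declaration in TypeScript.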

Thank you for considering this idea!

MyPy Playground: https://mypy-play.net/?mypy=latest&python=3.9&gist=29faabf0dd252d7772d2d9e01e8ce5a4

FreCap avatar Oct 30 '20 00:10 FreCap

I guess sequential definitions make sense when you want to (or need to) re-use the inner types in multiple places. Like:

class Inner(TypedDict):
    ...

class Outer1(TypedDict):
    field1: str
    inner: Inner

class Outer2(TypedDict):
    field2: str
    inner: Inner

For me, at least, this is rarely the case. Having to do flat/sequential definitions just adds unnecessary verbosity and boilerplate. If an inner type has no use outside of its context, an "in-place" placement (like in TypeScript) would be much nicer.

(I love how) Python's syntax is often short and straight-to-the-point, but this one introduces unnecessary bloat.

tuukkamustonen avatar Dec 29 '20 08:12 tuukkamustonen

Still very much a needed QoL. Something like:

Foo = TypedDict('Foo',
{
	'x': {
		'y': [str],
		'z': { 'foo': Path },
	},
})

Notice how literal lists and dicts are used as part of constructing the schema of the TypedDict, alongside otherwise normal types (str, Path, ...).

would be great. Not sure how to combine union type syntax with variable-as-types syntax (e.g. 'y': str | [str]).

munael avatar Feb 18 '21 12:02 munael

Is it feasible to implement this feature with a mypy plugin? I'm not familiar enough with the model to know whether this information is even analyzed.

gravesee avatar Aug 25 '22 10:08 gravesee

FYI seems to work in pyright https://github.com/microsoft/pyright/issues/4016

from typing import TypedDict

Geometry = TypedDict(
    "Geometry",
    {
        "position": TypedDict("Position", {"x": int, "y": int, "z": int}),
        "dimensions": TypedDict("Dimensions", {"width": int, "height": int}),
    },
)


def area(geometry: Geometry):
    return geometry["dimensions"]["width"] * geometry["dimensions"]["height"]


r = area({"position": {"x": 0, "y": 0, "z": 0}, "dimensions": {"width": 10, "height": 20}})

reveal_type(r) # Revealed type is 'builtins.int'

betafcc avatar Oct 07 '22 08:10 betafcc

@betafcc, pyright should not allow this. It's a false negative, which I will fix. Call expressions are forbidden in type annotations, so this should not be allowed.

erictraut avatar Oct 07 '22 15:10 erictraut

I'm not sure if I understand this right? Is there any reason that nested TypedDicts can't be implemented? Because even though my IDE screams at me when I write them, they work quite well.

So do you really think unnecessarily forbidding people to write code the way that THEY feel is appropriate for their projects is a good idea?

Because when I try to write a skeleton type for a complex, third-party API response, so that I can safely access the sub-parts that I'm interested in, destructuring this thing and spreading it out so that any attempt to understand the full type is an endless search-find-search adventure is not something I intend to do.

I think when it comes to "add-on type systems" TypeScript has shown what works well. And that works.

MatthiasvB avatar Oct 25 '22 06:10 MatthiasvB

Sharing another possible option to do that.

from typing import Type, TypedDict

class Job(TypedDict):
    id: int
    source: Type[TypedDict('Source', {'id': int, 'name': str})]

Although I would definitely agree that the following format would be far more readable, if it were allowed:

from typing import TypedDict

class Job(TypedDict):
    id: int
    source: {'id': int, 'name': str}

When the number of nested elements increases, it is easier and more intuitive for the human eye to read the option above than to scroll around looking for sequential definitions, which are redundant if they are only used in the context of one Job, as in the example above.

shams0910 avatar Nov 28 '22 14:11 shams0910

I'm another +1 for allowing nesting. As a TS dev learning Python, the verbosity and seeming difficulty involved in complex types is killing me, to the point of making me wish I could abandon Python entirely. 🙁

While we're at it:

Data = TypedDict("Data", {"some_key": some_type})

Anyone understand the point of the first argument? I've only ever seen it match the name to the left of the =. It would be great if we could just have:

Person = TypedDict({
    "name": str, 
    "pets": list[{
        "name": str, 
        "age": int
    }]
})
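On the first-argument question, as far as I can tell the string becomes the runtime name of the generated class, much like with namedtuple, and type checkers expect it to match the variable you assign to. The functional form also seems to be the only way to declare keys that aren't valid identifiers. A small sketch (Headers and its key are made up for illustration):

```python
from typing import TypedDict

Data = TypedDict("Data", {"some_key": int})

# The first argument ends up as the class name seen at runtime.
print(Data.__name__)

# The functional syntax accepts keys that could not be written as
# attribute names in the class-based syntax.
Headers = TypedDict("Headers", {"content-type": str})
headers: Headers = {"content-type": "text/plain"}
```

So the repetition is mostly there for runtime introspection and error messages, not because the checker needs it.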

lobsterkatie avatar Jul 28 '23 18:07 lobsterkatie

An issue relevant to an inline TypedDict definition like this is what syntax we should have for inheritance, which will be somewhat necessary for better reusability.

PIG208 avatar Sep 15 '23 03:09 PIG208

An inlined TypedDict syntax would need to be standardized across type checkers, so this discussion probably doesn't belong in the mypy issue tracker. It would be better to move it to the python/typing discussion forum.

There has already been some discussion within the broader typing community about inlined TypedDict syntax. See this thread for details. This hasn't converged yet on a formal proposal.

I've prototyped support for an inlined TypedDict syntax within pyright with the goal of gathering feedback and informing an eventual specification. With this syntax, the above TypedDict (including a nested TypedDict definition) would be specified as follows:

# pyright: enableExperimentalFeatures=true

Person = dict[{
    "name": str, 
    "pets": list[dict[{
        "name": str, 
        "age": int
    }]]
}]

erictraut avatar Sep 15 '23 04:09 erictraut

Experimental support for this has been merged to master; you can play with it using --enable-incomplete-feature=InlineTypedDict.
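A rough sketch of what this looks like, assuming the experimental syntax accepts a dict literal directly in an annotation (it is non-standard and may still change, so treat the exact spelling as an assumption). The area function is adapted from the Geometry example earlier in the thread, and the `from __future__ import annotations` line is there only so the annotation is not evaluated at runtime:

```python
from __future__ import annotations  # keep annotations unevaluated at runtime


# Assumed inline form: the nested dict literal stands in for what would
# otherwise be separate Geometry and Dimensions TypedDict classes.
def area(geometry: {"dimensions": {"width": int, "height": int}}) -> int:
    dims = geometry["dimensions"]
    return dims["width"] * dims["height"]


print(area({"dimensions": {"width": 10, "height": 20}}))
```

Without the flag, the dict literal in the annotation is expected to be rejected as an invalid type.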

ilevkivskyi avatar Jul 07 '24 10:07 ilevkivskyi