Older stubs still explicitly using `Any` rather than `Incomplete`
Thanks to typeshed-stats (#9386), I was able to compile the following list of stubs that predominantly use `Any`, which suggests they have not been switched over to `Incomplete` yet.
Ordered by the ratio "explicit `Any` / (explicit `Incomplete` + 1)", counted over parameters and variables and rounded down.
Excludes stubs marked as obsolete and stubs whose rounded-down ratio is <= 1.
| Package Name | Explicit Any | Explicit Incomplete | Ratio |
|---|---|---|---|
| WebOb | 119 | 0 | 119 |
| WTForms | 114 | 0 | 114 |
| simplejson | 59 | 0 | 59 |
| python-gflags | 50 | 0 | 50 |
| regex | 41 | 0 | 41 |
| pyflakes | 40 | 0 | 40 |
| Markdown | 33 | 0 | 33 |
| jmespath | 60 | 1 | 30 |
| toml | 30 | 0 | 30 |
| decorator | 28 | 0 | 28 |
| python-xlib | 28 | 0 | 28 |
| uWSGI | 21 | 0 | 21 |
| gevent | 114 | 5 | 19 |
| polib | 18 | 0 | 18 |
| slumber | 16 | 0 | 16 |
| six | 15 | 0 | 15 |
| PyYAML | 216 | 14 | 14 |
| pynput | 14 | 0 | 14 |
| stdlib | 3295 | 237 | 13 |
| opentracing | 25 | 1 | 12 |
| hdbcli | 24 | 1 | 12 |
| python-jose | 56 | 4 | 11 |
| fanstatic | 11 | 0 | 11 |
| Flask-Cors | 10 | 0 | 10 |
| mypy-extensions | 10 | 0 | 10 |
| protobuf | 144 | 15 | 9 |
| parsimonious | 27 | 2 | 9 |
| capturer | 9 | 0 | 9 |
| croniter | 9 | 0 | 9 |
| greenlet | 8 | 0 | 8 |
| singledispatch | 8 | 0 | 8 |
| translationstring | 8 | 0 | 8 |
| xmltodict | 8 | 0 | 8 |
| html5lib | 179 | 24 | 7 |
| paramiko | 45 | 6 | 6 |
| console-menu | 11 | 1 | 5 |
| waitress | 5 | 0 | 5 |
| python-crontab | 18 | 3 | 4 |
| cachetools | 4 | 0 | 4 |
| colorama | 4 | 0 | 4 |
| qrbill | 4 | 0 | 4 |
| Pygments | 178 | 47 | 3 |
| vobject | 93 | 24 | 3 |
| PyMySQL | 88 | 28 | 3 |
| dateparser | 86 | 21 | 3 |
| Send2Trash | 3 | 0 | 3 |
| untangle | 3 | 0 | 3 |
| usersettings | 3 | 0 | 3 |
| seaborn | 186 | 65 | 2 |
| mock | 137 | 45 | 2 |
| psycopg2 | 97 | 37 | 2 |
| beautifulsoup4 | 72 | 33 | 2 |
| aws-xray-sdk | 61 | 21 | 2 |
| humanfriendly | 54 | 22 | 2 |
| requests | 53 | 19 | 2 |
| JACK-Client | 2 | 0 | 2 |
| chevron | 2 | 0 | 2 |
| first | 2 | 0 | 2 |
| ibm-db | 2 | 0 | 2 |
| libsass | 2 | 0 | 2 |
| retry | 2 | 0 | 2 |
| tabulate | 2 | 0 | 2 |
| toposort | 2 | 0 | 2 |
| gdb | 2 | 0 | 2 |
(updated as of 2023-11-02)
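As a rough illustration of how the Ratio column appears to be derived, here is a minimal sketch. The function name `any_ratio` is hypothetical, and the floor division is inferred from the table values (for example jmespath: 60 / (1 + 1) = 30), not taken from the typeshed-stats code itself.

```python
def any_ratio(explicit_any: int, explicit_incomplete: int) -> int:
    """Return explicit Any / (explicit Incomplete + 1), rounded down.

    The "+ 1" avoids division by zero for stubs with no Incomplete at all.
    """
    return explicit_any // (explicit_incomplete + 1)


# A few (explicit Any, explicit Incomplete) pairs copied from the table above.
rows = {
    "WebOb": (119, 0),
    "jmespath": (60, 1),
    "gevent": (114, 5),
    "stdlib": (3295, 237),
}

for name, (any_count, incomplete_count) in rows.items():
    print(name, any_ratio(any_count, incomplete_count))
```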
We can also list third-party stubs by how many module-level and class-level variables they annotate as `Any`:
for d in stubs/*/; do echo "$d,$( grep -ERoi '(^| )[[:alnum:]]+?: Any$' "$d" | wc -l )"; done
| Package Name | Var Any count |
|---|---|
| boto | 162 |
| html5lib | 144 |
| ldap3 | 133 |
| Pygments | 131 |
| psutil | 101 |
| redis | 92 |
| oauthlib | 85 |
| vobject | 78 |
| PyYAML | 77 |
| commonmark | 71 |
| google-cloud-ndb | 55 |
| PyMySQL | 37 |
| passlib | 35 |
| psycopg2 | 33 |
| dateparser | 32 |
| httplib2 | 32 |
| beautifulsoup4 | 27 |
| protobuf | 27 |
| aws-xray-sdk | 25 |
| humanfriendly | 20 |
| mock | 16 |
| fpdf2 | 12 |
| paramiko | 9 |
| python-jose | 9 |
| requests | 6 |
| Markdown | 5 |
| boltons | 2 |
| pep8-naming | 2 |
| pyflakes | 2 |
| python-gflags | 2 |
| WebOb | 2 |
(updated as of 2024-07-25)
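For anyone who prefers Python over the shell one-liner, a similar count could be sketched as follows. This is an approximation rather than an exact port: the pattern below is case-sensitive and uses `\w+`, so it also matches names containing underscores, and the function names are hypothetical.

```python
import re
from pathlib import Path

# A line consisting only of "<name>: Any", optionally indented.
ANY_VARIABLE = re.compile(r"^[ \t]*\w+: Any$", re.MULTILINE)


def count_any_variables(source: str) -> int:
    """Count module-level and class-level variables annotated as bare Any."""
    return len(ANY_VARIABLE.findall(source))


def count_per_package(stubs_root: Path) -> dict[str, int]:
    """Sum the counts over every .pyi file of each directory under stubs_root."""
    counts: dict[str, int] = {}
    for package_dir in sorted(p for p in stubs_root.iterdir() if p.is_dir()):
        counts[package_dir.name] = sum(
            count_any_variables(stub.read_text(encoding="utf-8"))
            for stub in package_dir.rglob("*.pyi")
        )
    return counts
```

Calling `count_per_package(Path("stubs"))` from a typeshed checkout would then give one count per package, which can be sorted to produce a table like the one above.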
Do you think it's fine doing a single PR for those (single search & replace change across many third party stubs) instead of 40-70 PRs?
Still gotta validate that those with few Any actually meant it. Bigger ones can probably be mostly blindly changed?
> Do you think it's fine doing a single PR for those (single search & replace change across many third party stubs) instead of 40-70 PRs?

Maybe we could start off doing a grep for (or using a script to auto-update) function signatures with `foo: Any | None = ...`. We can be pretty confident that those are artefacts of stubgen, so we can probably update all of those to `foo: Incomplete | None = ...` in a bulk PR pretty safely.
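A script for that first stage could look roughly like the sketch below. The function name is hypothetical, and the sketch only rewrites the text: adding `from _typeshed import Incomplete` to each touched file and pruning unused `Any` imports would still need to be handled separately.

```python
import re

# Matches only the stubgen-style default, so return types such as
# "-> Any | None" are left untouched.
STUBGEN_DEFAULT = re.compile(r": Any \| None = \.\.\.")


def replace_optional_any_defaults(source: str) -> tuple[str, int]:
    """Rewrite ": Any | None = ..." to ": Incomplete | None = ...".

    Returns the new source and the number of replacements made.
    """
    return STUBGEN_DEFAULT.subn(": Incomplete | None = ...", source)


before = "def f(a: Any | None = ..., b: int = ...) -> Any | None: ...\n"
after, replaced = replace_optional_any_defaults(before)
print(after, end="")
print(replaced)
```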
I'd say fairly safe, with only few false positives, are:
- Function arguments that contain a union with `Any`. (But not return types.)
- Fields that contain `Any`, either standalone or as union.
It's also not a complete disaster if a wrong `Any` gets changed to `Incomplete`; it just means it needs to be rechecked at some point.
Similar to `: Any | None = ...`, `: Any | None\n` seems like an equivalent stubgen artefact for class variables.
> - Fields that contain `Any`, either standalone or as union.

`: Any | None\n` seems very safe to do a search-and-replace for. For `: Any\n`, I feel a little queasy. But I'd feel more confident if we only replaced places which had, say, three `Any` attributes in a row, e.g.

    class Foo:
        a: Any
        b: Any
        c: Any
We could find those using regexes or AST.
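As one possible take on the AST route, the sketch below reports classes that contain a run of at least three consecutive bare `Any` attributes. The helper names and the `min_run` parameter are hypothetical, and it only looks at direct class bodies, not module-level variables.

```python
import ast


def is_bare_any_attribute(stmt: ast.stmt) -> bool:
    """True for statements of the form "name: Any" with no assigned value."""
    return (
        isinstance(stmt, ast.AnnAssign)
        and stmt.value is None
        and isinstance(stmt.target, ast.Name)
        and isinstance(stmt.annotation, ast.Name)
        and stmt.annotation.id == "Any"
    )


def find_any_runs(source: str, min_run: int = 3) -> list[tuple[str, list[str]]]:
    """Return (class name, attribute names) for each long enough run."""
    runs: list[tuple[str, list[str]]] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        current: list[str] = []
        # The trailing None is a sentinel that flushes a run ending the class body.
        for stmt in [*node.body, None]:
            if stmt is not None and is_bare_any_attribute(stmt):
                current.append(stmt.target.id)
                continue
            if len(current) >= min_run:
                runs.append((node.name, current))
            current = []
    return runs
```

Because `ast.parse` only checks syntax, this works on stub files without resolving any imports.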
> Function arguments that contain a union with `Any`. (But not return types.)

Seems fine to me after running those we deem "safer".

> Fields that contain `Any`, either standalone [...]

+1 What Alex said.
After `: Any | None = ...` and `: Any | None\n`, there are 15 `Any | None` left. Except for mock, they mostly look like they could be `Incomplete | None`.
- 4 in mock
- 1 in pyOpenSSL
- 1 in redis
- 8 in SQLAlchemy
After a series of "autofixes", I can update the table above to see if there are still any obvious ones.
It might actually be fine to change every `Any` in old stubs to `Incomplete`. In theory, we'll go over the `Incomplete`s later and change them back to `Any` if appropriate.
The only thing I'm a bit uneasy about is changing `Any | None` return types, because they are fairly common workarounds for the missing permissive union type, and they are not that easy to spot.
@srittau I agree with return types. I didn't include them in my table above for similar reasons.
Type aliases should probably also be left untouched, if I'm to guess.
> Similar to `: Any | None = ...`, `: Any | None\n` seems like an equivalent stubgen artefact for class variables.

Fancy taking this on as "stage 2", @Avasam? (Regardless of how far we want to go, I think I'd prefer to keep doing this in stages, so we can evaluate the risk level for each stage independently.)
> so we can evaluate the risk level for each stage independently

Plus, we may learn things so the same can be done for stdlib.
> The only thing I'm a bit uneasy about is changing `Any | None` return types, because they are fairly common workarounds for the missing permissive union type, and they are not that easy to spot.

We now have `MaybeNone` for those!
> We now have `MaybeNone` for those!

I had the same thought while just reading through this thread again. :)
Since our current policy asks for all `Any`s to be commented, it's also easier to manually spot `Any`s that need either to be changed to `Incomplete` or commented. I'm not sure if it helps much with an automated process, though.