Generate and provide PEP-484 stub files for type-checking

Open OddBloke opened this issue 7 years ago • 23 comments

Having fleshed-out type definitions for the objects that boto3 generates at runtime would make writing bug-free code against boto3 a lot easier.

(The documentation generation code must already do the majority of the inference that would be required for this; could it be repurposed in some way?)

OddBloke avatar Apr 10 '17 21:04 OddBloke

We looked into this a while ago, but it ended up not being terribly useful for a few reasons. Right off the bat there's no good way to say boto3.client('s3') -> botocore.client.S3. Even if you get past that, all of our responses and most of our parameters are in the form of dicts with pre-defined keys, which afaik isn't representable at all except as Dict[object, object] (or some amalgamation of unions). I'm pretty sure @jamesls also uncovered some other issues from talking to some of the mypy people at the last PyCon, but I don't recall exactly what those were.

Basically, to make type hints effective we would need to rewrite boto3 and botocore to be fully code-generated, which is something we don't currently have plans to do.

JordonPhillips avatar Apr 11 '17 18:04 JordonPhillips

You can cast things with the typing module, so people calling boto3.client could do that themselves (i.e. s3_client = cast(botocore.client.S3, boto3.client('s3'))) to opt in to the typing. Not ideal, but not insurmountable.

That said, I don't think the clients are the most interesting place to see these; rather, I think the resources are. Looking at a boto3.resources.factory.ec2.Image, we have a bunch of str attributes, but also bools (.ena_support), lists (.block_device_mappings) and methods.

(For clients, the dicts would probably be Dict[str, Any]; I agree that this still isn't especially compelling.)

OddBloke avatar Apr 14 '17 18:04 OddBloke

That's a fair point, resources would be a much better candidate for stubs

JordonPhillips avatar May 10 '17 22:05 JordonPhillips

You can cast things with the typing module, so people calling boto3.client could do that themselves (i.e. s3_client = cast(botocore.client.S3, boto3.client('s3'))) to opt in to the typing. Not ideal, but not insurmountable.

How does this actually work? There's no type botocore.client.S3. Also, botocore doesn't actually do codegen, so I'm not clear how this provides any value? Perhaps I'm misunderstanding?

Anyhow, I'd echo that having types for boto would be really useful.

b3ross avatar Oct 31 '17 00:10 b3ross

I recently used the Java and JavaScript (TypeScript) SDKs (I had to implement roughly the same code on 3 platforms), and type information is definitely something that would be nice to have.

Comparing:

  • Java is java... super verbose, but all the type information is there, making auto-completion very fluent. If it was missing the types/auto-complete, and we had to follow http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/index.html during coding, that would drive people insane.
  • Typescript is way less verbose, plus it has type information. It's really really nice. I never even took a look at http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/index.html :smiley: because type information was sufficient (I did have to dive into sources many times).
  • Python (or boto3, rather) is like Typescript, but without type information. So, I have to follow http://boto3.readthedocs.io all the time. The documentation is in nice form (one page per service so searching by Ctrl+F is quick and easy, and docs describe all the fields with sufficient details), so it ain't that bad, but having type information there (on requests and responses) would still be great.

Most benefit would come from providing type stubs for requests and responses.

tuukkamustonen avatar Dec 01 '17 07:12 tuukkamustonen

@tuukkamustonen the good news is that it's possible to have types for dicts now! So that blocker is gone. But we still have the issue that, since clients are created from factory methods, the type checker won't know what class your client is.

JordonPhillips avatar Dec 01 '17 21:12 JordonPhillips

Also note that although TypedDict is out, it is an experimental feature and is not yet supported by IDEs (e.g. https://youtrack.jetbrains.com/issue/PY-24879).
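
For illustration, a rough sketch of what a TypedDict response type could look like, assuming a Python version where TypedDict is importable from typing (3.8 or later). ListBucketsOutput and Bucket are hypothetical, heavily simplified shapes, not definitions shipped by boto3.

```python
from typing import List, TypedDict


class Bucket(TypedDict):
    # Simplified: only the one key this example reads.
    Name: str


class ListBucketsOutput(TypedDict):
    Buckets: List[Bucket]


def bucket_names(response: ListBucketsOutput) -> List[str]:
    # A checker can flag a misspelled key such as response["Bucket"].
    return [bucket["Name"] for bucket in response["Buckets"]]


example: ListBucketsOutput = {"Buckets": [{"Name": "logs"}, {"Name": "backups"}]}
names = bucket_names(example)
```

At runtime these are still plain dicts; the key names and value types exist only for the checker.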

tuukkamustonen avatar Dec 02 '17 12:12 tuukkamustonen

Even if you don't want to do a wholesale refactoring, just providing a set of types for responses would be super helpful. (I'm assuming there's some sort of specification that is generating the resources and clients.) As an example, I made a type stub for the S3PutEvent (based on AWS docs). You could ship them as .pyi files alongside

It would be nice just to provide empty typed classes for easy mocking/type-checking too, e.g. here are a couple of methods from Amazon Athena, annotated.

Even though it's all factory methods, at least you could write something like:

import boto3
from boto3.type_stubs.s3 import S3Resource
s3: S3Resource = boto3.resource('s3')

or

import boto3
from boto3.type_stubs.athena import GetQueryExecutionResponse, AthenaClient
athena: AthenaClient = boto3.client('athena')
execution: GetQueryExecutionResponse = athena.get_query_execution(QueryExecutionId='...')

which would then let the type checker work quite well :)

The Go AWS SDK generates all of these stubs, so I'm guessing there's some dictionary somewhere that might work for it?

jtratner avatar Apr 10 '18 00:04 jtratner

I take the point about a full refactoring, but perhaps it's possible to provide an additional factory API which would be typesafe. Users could opt into that, and its implementation could just be a shim over the existing factory interface. That doesn't imply a change to fully code-generated clients as far as I can tell (but perhaps I'm missing something :)).

rbtcollins avatar May 29 '18 02:05 rbtcollins

Looking forward to seeing this released.

tadas-subonis avatar Jun 03 '18 14:06 tadas-subonis

I don't understand the point about requiring a full refactoring of boto3. AWS already generates human-readable documentation like https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Image.description

A solution could be to also generate machine-readable documentation in the form of stub classes.

yaroslavvb avatar Aug 30 '18 00:08 yaroslavvb

@yaroslavvb once you have stub file generation taken care of, there's little reason for them to actually remain stub files. Say there is a client stub like this:

class CloudWatch(BaseClient):
    def delete_alarms(self, param: ParamType) -> ReturnType: ...

Then it's not really that big of a leap to turn it into this:

class CloudWatch(BaseClient):
    def delete_alarms(self, param: ParamType) -> ReturnType:
        return self._make_api_call("DeleteAlarms", {"Param": param})

I experimented with this recently. I wrote a script that generates clients based on botocore data files. I just pushed it (https://github.com/KyrychukD/botocore/tree/generate-clients/botocore/data) in case someone is interested. I haven't had time to fit the event emitter in properly, so the generated clients may not be fully functional.
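
To give a feel for the idea (this is not the script from that branch), here is a toy sketch that renders stub source from a model. MODEL is a hand-written stand-in with made-up keys, far simpler than the real botocore data files, and snake_case and render_stub are hypothetical helpers.

```python
import re
from typing import Any, Dict, List

# Hand-written, heavily simplified stand-in for a service model.
# The key names here are assumptions for this example only.
MODEL: Dict[str, Any] = {
    "service": "CloudWatch",
    "operations": ["DeleteAlarms", "DescribeAlarms"],
}


def snake_case(name: str) -> str:
    # Insert "_" before every capital except the first, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def render_stub(model: Dict[str, Any]) -> str:
    # Emit one untyped-but-named method per operation in the model.
    lines: List[str] = [f"class {model['service']}(BaseClient):"]
    for operation in model["operations"]:
        lines.append(
            f"    def {snake_case(operation)}(self, **kwargs: Any) -> Dict[str, Any]: ..."
        )
    return "\n".join(lines)


stub_source = render_stub(MODEL)
print(stub_source)
```

A real generator would also have to translate the input and output shapes into parameter and return types, which is where most of the work is.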

dmytrokyrychuk avatar Aug 30 '18 06:08 dmytrokyrychuk

@KyrychukD is that the right link? I just see some data files that were last touched 1 month ago. If you had a list of generated stub files that can be used for type annotations, that would be pretty useful; we could finally get decent completion/help in IDEs.

yaroslavvb avatar Aug 31 '18 21:08 yaroslavvb

@yaroslavvb

@KyrychukD is that the right link? I just see some data files that were last touched 1 month ago.

The link is to the branch where I experimented with generating client classes based on the data files. I meant to link to the root of the repo instead of the data directory, though, sorry about that. The right link is https://github.com/KyrychukD/botocore/tree/generate-clients

If you had a list of generated stub files that can be used for type annotations, that would be pretty useful; we could finally get decent completion/help in IDEs.

Unfortunately, I do not have a complete solution for boto3 at the moment; all the work I've done so far was for botocore.

dmytrokyrychuk avatar Sep 01 '18 20:09 dmytrokyrychuk

Lack of autocompletion for methods available when instantiating a client makes using boto3 very unpleasant. Please address this in a future version of boto3!

cervantek avatar Oct 08 '18 21:10 cervantek

Right off the bat there's no good way to say boto3.client('s3') -> botocore.client.S3.

Should now be possible with https://mypy.readthedocs.io/en/latest/literal_types.html

the form of dicts with pre-defined keys, which afaik isn't representable at all except as Dict[object, object]

https://mypy.readthedocs.io/en/latest/more_types.html#typeddict
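
A small sketch of how literal types could drive the factory's return type, assuming Python 3.8 or later for typing.Literal. The client function and the S3Client and SQSClient classes below are hypothetical stand-ins, not the real boto3 or botocore objects.

```python
from typing import Any, Dict, Literal, overload


class S3Client:
    """Hypothetical stub-style class; not a real botocore type."""

    def list_buckets(self) -> Dict[str, Any]:
        return {"Buckets": []}


class SQSClient:
    """Hypothetical stub-style class; not a real botocore type."""

    def list_queues(self) -> Dict[str, Any]:
        return {"QueueUrls": []}


@overload
def client(service_name: Literal["s3"]) -> S3Client: ...
@overload
def client(service_name: Literal["sqs"]) -> SQSClient: ...


def client(service_name: str) -> Any:
    # Runtime implementation: a plain lookup standing in for the real factory.
    registry = {"s3": S3Client, "sqs": SQSClient}
    return registry[service_name]()


s3 = client("s3")    # a checker picks the first overload: S3Client
sqs = client("sqs")  # a checker picks the second overload: SQSClient
```

One overload per service is tedious to write by hand, which is presumably why the stubs would need to be generated.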

ikonst avatar Feb 05 '19 06:02 ikonst

Microsoft's Azure looks better designed, just sayin ;)

https://github.com/Azure/azure-sdk-for-python

(I'm trying to light the fire to get AWS to try on their SDK)

four43 avatar Feb 28 '19 22:02 four43

Is there any hope of getting type checking in boto3? Should we be looking to third-party libraries to provide stubs if this capability won't be introduced in the core library anytime soon? I am specifically looking at the boto3_type_annotations project, which has an active PR to convert to the PEP-561 style of stubs. The project's maintainer opened an issue on this project to get feedback on the approach, at which point he was directed to an alternative repo that AWS has (https://github.com/boto/botostubs), though that one carries a caveat not to use it in production and hasn't been touched for about 5 months now. I think there were some mixed signals, with them saying:

We have a working implementation of boto3 type stubs internally, that we were waiting for a python 3.8 PEP (specifically https://www.python.org/dev/peps/pep-0586/) to be released before publishing.

Python 3.8 is now out... are there updates coming soon to make "production ready" typing information available?

sdavids13 avatar Nov 20 '19 02:11 sdavids13

Hello! I have created a mypy_boto3 project (https://github.com/vemel/mypy_boto3) that generates annotations from boto3 docstrings and is compatible with mypy. It is based on boto3_type_annotations, but I added a couple of new features:

  • boto3-stubs module for mypy support
  • services split into submodules, as the full installation is huge
  • pyi files for boto3
  • types for dict arguments and responses

Let me know if this is useful.

vemel avatar Nov 22 '19 01:11 vemel

Any updates to this? Would be extremely helpful to speed up development and prevent bugs.

hoffa avatar Jul 21 '22 16:07 hoffa

I brought this topic up for discussion with the team and the consensus was that this feature request is not currently under consideration.

Integrating type hints will be considered for the next major version of boto3. In the meantime there are packages such as https://pypi.org/project/boto3-stubs/ you can use for type checking.

tim-finnigan avatar Jan 12 '23 18:01 tim-finnigan

@tim-finnigan is there any tracking issue/project for the next major version? Is/will that be boto4?

fitzoh avatar Jan 12 '23 18:01 fitzoh

@fitzoh no, there is not currently a tracking issue or project for the next major version. There will be an official announcement prior to then. (I can’t provide any guarantee as to when that would be.) In the meantime we are using the needs-major-version label to identify issues worth tracking for consideration in the next major version.

tim-finnigan avatar Jan 12 '23 18:01 tim-finnigan