jsii icon indicating copy to clipboard operation
jsii copied to clipboard

🎯 Python Experience Improvements

Open RomainMuller opened this issue 5 years ago • 29 comments

Preamble

This is a tracking issue summarizing the existing pain points in Python front-ends generated by jsii-pacmak. Members of the community are more than welcome to contribute to this, for example by sharing their pet peeves in the communication thread below.

The initial post will be periodically updated to include a summary of the items that are accepted as needed improvements to our Python offering.

Known Pain Points

  • [x] aws/aws-cdk#474 - Incomplete CDK documentation due to quirks in sphinx
    • Nested types are missing
    • Many parameters are showing typed as Forwardref
  • [x] Incorrectly transliterated sample code in the cdk documentation
    • Type information is necessary in many cases, and jsii-rosetta does not require that examples compile
    • In many cases, it's only a matter of adding a rosetta fixture
  • [ ] Un-pythonic reliance on a custom metaclass
    • Metaclasses complicate extension: all sub-classes in a class family must eventually derive from the same metaclass
    • Python doesn't really have interfaces, the idiom is using Protocols
      • Protocols can be implicitly implemented (structural typing)
      • Explicit implementation uses the extension syntax, but this cannot work due to metaclass dissonance
  • [ ] Passing dicts in lieu of jsii structs does not consistently work
    • Case adjustments of values is not always consistently done, resulting in runtime errors or worse (silently ignored properties, incorrect/unexpected side-effects, ...)
  • [ ] Type-checking errors from @jsii/kernel are obscure

Discussion

Incomplete CDK Documentation

The AWS CDK API reference documentation is currently generated using a private tool, which leverages the idiomatic documentation generator for each supported language (docfx, javadoc, sphinx, ...). While this has the advantage of giving users the same documentation experience that they're used to for their language of choice, it has several disadvantages:

  • it is an awkward experience for users building polyglot applications, since each language's documentation set is entirely separate from the others
  • we have to suffer the limitations of the idiomatic tools
    • customizing templates is often a difficult process
    • certain jsii features are broken or weird in some languages (i.e: sphinx and nested types)

Additionally, the current generator being private means other users of jsii are not able to produce a documentation site that looks and feels like the AWS CDK reference. One of the goals of jsii being that developers need not learn all about the standard tools of front-end languages, providing a jsiidoc tool would greatly reduce friction around documenting polyglot APIs. This is however a significant project, and shorter-term solutions (such as using autodoc for Python with generated .rst files).

Incorrectly Transliterated Sample Code

The jsii-rosetta tool is able to collect code example snippets from various locations in the jsii assemblies, and attempts to automatically transliterate those to other front-end languages. It delivers the best fidelity when it is able to obtain complete type information around the example (i.e: it is valid and compiling TypeScript code); and must resort to broad assumptions otherwise.

In the context of AWS CDK, many example snippets lack context information necessary in order to obtain complete type information, which often leads to incorrectly transliterated code. In the Python context, this can mean the transliterated code uses keyword arguments at a location where they are not actually supported.

There is a --fail option to jsii-rosetta which will cause it to abort if an example is not compiling (the corollary of this being there cannot be sufficient type information). Enabling this behavior has two benefits:

  • It guarantees example code is up-to-date with the current library's API
  • Complete type information should be available, enabling the best possible transliteration result

This feature is however not enabled by default on the AWS CDK, as many examples there do not have the full context needed for them to compile. In many cases, all that is needed is the introduction of fixtures that provide the necessary context while not weighting example code shown in the documentations with repetitive boilerplate.

Un-pythonic reliance on a custom metaclass

The jsii runtime for Python provides a JSIIMeta metaclass that is used by all generated bindings for classes (classes from the jsii assembly). As a consequence, attempting to leverage Python's multiple inheritance (for example, to implement an interface explicitly) may lead to a runtime error, as all bases of a Python class must be rooted on the same metaclass.

Additionally, the JSIIMeta metaclass is merely used to record jsii-specific metadata on generated classes; but since those are code-generated, it would be trivial to register the exact same metadata as part of the generated declarations, rendering the metaclass useless.

Not relying on the metaclass would mean users are able to implement interfaces explicitly in the idiomatic way, which would provide better information to static analyzers such as mypy, to help ensure developers have correctly implemented an interface, and are correctly passing values that conform to interfaces to method parameters.

# The slightly awkward:
@jsii.implements(lib.IInterface)
class Implementor():
    ...

# Could become:
class Implementor(lib.IInterface):
    ...

Passing dict in lieu of jsii structs does not consistently work

Similar to the Javascript idioms, Python developers will often intuitively use dict literals in places where the API expects some instance of a jsii struct. The generated Python code has specific code paths to convert a dict to the correct struct instance, however there seem to be many cases where this conversion is not happening, leading to:

  • type-checking errors in the @jsii/kernel module
  • property keys not being translated from snake_case to camelCase
  • properties being dropped entirely by the @jsii/kernel

The workaround forces users to resort to relatively verbose boilerplate code, which in turns looks very un-pythonic.

# The somewhat heavy-handed
s3.Bucket(self, 'Gold', lifecycle_rules=[s3.LifecycleRule(expiration=cdk.Duration.days(15)])

# Could become:
s3.Bucket(self, 'Gold', lifecycle_rules=[{ 'expiration': cdk.Duration.days(15) }])

Type-checking errors from @jsii/kernel are obscure

When a user makes a programming mistake and passes the wrong kind of value to an API (be it a parameter of some method or constructor, or the value being set to some property), the errors generated by the @jsii/kernel are not very helpful to developers as:

  • They no not relate to the user's native language type names
  • They may lack sufficient context to be parseable by the user (for example, when the error occurs in a deeply-nested value)

RomainMuller avatar Aug 19 '20 12:08 RomainMuller

I know you're working on it already, but might want to put some form of "runtime type validation" in there as well. I saw a LOT of issues where users confused L1 and L2 classes.

rix0rrr avatar Aug 19 '20 13:08 rix0rrr

Incomplete CDK Documentation

What you're writing about a better tool is correct, but is a big project. There's probably some effective short-term things we can do instead (for example: stop relying on autodoc for Python and just starting to generate RST docs ourselves).

Un-pythonic reliance on a custom metaclass

I'm missing some concrete examples here. Are you proposing to replace:

@jsii.implements(cdk.ISomething)
class Boo:
   ...

# with
class Boo(cdk.ISomething):
   ...

?

rix0rrr avatar Aug 19 '20 13:08 rix0rrr

I'm missing some concrete examples here. Are you proposing to replace:

@jsii.implements(cdk.ISomething)
class Boo:
   ...

# with
class Boo(cdk.ISomething):
   ...

Yes, exactly that.

RomainMuller avatar Aug 19 '20 13:08 RomainMuller

Did you recently change the method signatures to include line-breaks? That was my biggest pet peeve but it seems to be fixed now.

richardhboyd avatar Aug 19 '20 17:08 richardhboyd

I'd love to have a way to unit test my stacks. The assert module only works with TypeScript AFAIK. We now have to use snapshot tests. Ie. generate cloudformation template and compare it with a known and validated one. Then if the test fails, we get two huge strings that don't match. It works, but it's not atomic.

pepastach avatar Aug 20 '20 06:08 pepastach

Also fix the pip-installation: https://github.com/aws/aws-cdk/issues/3406#issuecomment-525688693

brainstorm avatar Aug 20 '20 08:08 brainstorm

@brainstorm can you elaborate a bit more on "fix"? Are you referring to one large package that includes all CDK modules instead of individual modules per service?

richardhboyd avatar Aug 20 '20 10:08 richardhboyd

Not instead, but in addition. pip install aws-cdk should pull all the aws-cdk-* submodules avoiding ImportErrors. You can as well just install the just modules you need by being explicit about the deps (i.e only S3 or Lambda CDK submodules).

Now, I know this could be seen as "bloat", but from a DX perspective, I reckon it's the better decision IMO?

brainstorm avatar Aug 20 '20 10:08 brainstorm

We're working on something similar with monocdk

richardhboyd avatar Aug 20 '20 11:08 richardhboyd

Adding a section to the documentation on unit testing with examples. It would be super slick to have this integrated with the cli, ie. cdk run-tests

adamjkeller avatar Aug 20 '20 18:08 adamjkeller

awkward experience when looking at docs indeed

kristianpaul avatar Aug 21 '20 16:08 kristianpaul

Did you recently change the method signatures to include line-breaks? That was my biggest pet peeve but it seems to be fixed now.

We used to run generated Python code through black for a while. Reverted because it was too slow, but we're generating friendlier code now (lines ideally < 90 chars, but sometimes longer because we couldn't roll out a full fledged Python parser here).

RomainMuller avatar Aug 24 '20 17:08 RomainMuller

Adding a section to the documentation on unit testing with examples. It would be super slick to have this integrated with the cli, ie. cdk run-tests

This is interesting but could be tricky (because we'd likely end up having to be opinionated with respects to test frameworks, and opinions usually don't get you a lot of new friends 🤣).

We've considered this as "living with the languages' idioms", so we'd expect you want to do it "like you normally test". That seems to go against the idea of integrating this in the cdk CLI, but well... if this is what the majority wants....

RomainMuller avatar Aug 24 '20 17:08 RomainMuller

I'd like to add some notes about my experience using CDK with type checking (specifically Pyright/Pylance in Visual Studio Code):

  • class properties (such as Runtime.PYTHON_3_8 from aws_cdk.aws_lambda) are not seen as proper class properties by type checkers. Instead, they are seen as normal functions, requiring a manual typing.cast() to use these values where needed
  • implementations of interfaces (such as IHttpRouteIntegration implemented by LambdaProxyIntegration in aws_cdk.aws_apigatewayv2_integrations) are not seen as true implementations, since the only indication of this relationship is through a custom decorator that the type checker doesn't know about. As a result, this also requires using typing.cast to use these values where needed

jarrodldavis avatar Feb 16 '21 19:02 jarrodldavis

The interfaces should be declared as Protocols, which means that normally, if they are correctly implemented, structural typing should make them match... Although I'd say I have seen cases where mypy would not seem to be impressed. I'm not too sure how to address, especially as metaclasses are otherwise involved (and they complicate everything in that flow)

RomainMuller avatar Apr 01 '21 08:04 RomainMuller

pyright is compaining about virtually all interfaces for me, is there a path to fixing this?

gshpychka avatar Apr 14 '21 09:04 gshpychka

I am also experiencing the interface issues described above. We're looking to adopt the CDK for our main platform and as a mostly Python/C++ shop the Python CDK seemed like a great choice for us. I didn't make it very far into my first sample app to test things out before mypy started throwing errors.

from aws_cdk.core import Construct, Stack
from aws_cdk import aws_ec2 as ec2

class MyEC2Stack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        ec2.Instance(self, 'my-app',
            instance_type=ec2.InstanceType('m5.4xlarge'),
            machine_image=ec2.MachineImage.generic_linux({
                'us-east-1': 'ami-0affd4508a5d2481b', # CentOS 7 (x86_64) - with Updates HVM
            }),
            vpc=ec2.Vpc(self, "my-app-vpc"))

image

I would love to see this fixed so we can start building out our application stack with confidence that we're passing the right level of constructs around!

dmartin-gh avatar Apr 21 '21 23:04 dmartin-gh

class properties (such as Runtime.PYTHON_3_8 from aws_cdk.aws_lambda) are not seen as proper class properties by type checkers. Instead, they are seen as normal functions, requiring a manual typing.cast() to use these values where needed

Seeing this as well in multiple IDEs. I swear I didn't see it before, but maybe it was just ignored by my IDE before. Code certainly works, just need the right @thing perhaps. I couldn't figure out which one.

polothy avatar Jun 18 '21 18:06 polothy

EDIT: Sorry I think this might actually be the wrong Issue I had bookmarked but I cannot for the life of me find the JSII / CDK issue discussing lack of intellisense and docstrings for CDK packages.

--

I wanted to offer a cool feature I found in PyCharm, period, but also could be used to generate something better.

In PyCharm: SETTINGS -> TOOLS -> EXTERNAL DOCUMENTATION

Add a new source as below:

# Module Name
aws_cdk
# URL/Path Pattern
https://docs.aws.amazon.com/cdk/api/latest/python/{module.name}/{class.name}.html#{module.name}.{class.name}

image

Outcome

  • It leaves much to be desired however a built-in URL to docs is an interesting first step.
  • The docs can also be opened with SHIFT+F1 on Windows.

Speculation

  • Could this potentially lead to pointing to a local file which CAN be read, notably by WSL2 Python interpreter + PyCharm IDE?
  • Could a plugin be written to GET https://docs.aws.amazon.com/cdk/api/latest/python/aws_cdk.aws_ec2/ClientVpnEndpoint.html#aws_cdk.aws_ec2.ClientVpnEndpoint and output?

image

osulli avatar Sep 08 '21 11:09 osulli

I'm also having issues using Pyright and interfaces, same as @jarrodldavis described.

CarlosDomingues avatar Oct 13 '21 20:10 CarlosDomingues

Another quirk of the un-Pythonic metaclass requirement: invalid attributes (of a class that uses JSIIMeta as a metaclass) don't throw AttributeErrors. Instead they return quietly as None...

I suspect there's some clever way of overloading __getattr__ that is necessary for jsii vars, but doesn't handle the case of non-existent Python variables as expected.

rirze avatar Apr 15 '22 16:04 rirze

About passing dicts:

I am playing with an adjustment of https://github.com/aws/jsii/blob/main/packages/jsii-pacmak/lib/targets/python.ts

In case where name_mappings are present instead of only property getter also setter could be emitted.

These setters could then cast in case a dict is found and the init would use those setters.

This is the cast function:

def cast(klass, value):
    if value.__class__.__name__ == 'dict':
        return klass(**value)
    else:
        return value

For example CfnBucketProps from cdk:

In init:

        if logging_configuration is not None:
            self._values["logging_configuration"] = logging_configuration

becomes:

        if logging_configuration is not None:
            self.logging_configuration = logging_configuration

And a setter is added:

    @logging_configuration.setter
    def logging_configuration(
        self,
        value: typing.Optional[typing.Union["CfnBucket.LoggingConfigurationProperty", _IResolvable_da3f097b]],
    ) -> None:
        self._values["logging_configuration"] = cast(CfnBucket.LoggingConfigurationProperty, value)

This would allow those dicts :-) What do you think about this?

Jacco avatar May 08 '22 15:05 Jacco

@Jacco one property of jsii structs is that they are pure-data, immutable objects. Adding setters to those would break the immutable property.

So basically, we'd need those setters to be private, if we want to go down that path... Otherwise, we'll have to carefully consider the semantic implications of allowing mutations here in violation of the jsii type system (it might be fine, but I wouldn't want to take that decision lightly)

RomainMuller avatar Jun 29 '22 09:06 RomainMuller

@rirze

Another quirk of the un-Pythonic metaclass requirement: invalid attributes (of a class that uses JSIIMeta as a metaclass) don't throw AttributeErrors. Instead they return quietly as None...

Do you have an example of code that has this behavior? I'm a bit surprised here, as the metaclass doesn't directly override __getatt__, and our dynamic proxy objects (appear to) appropriately raise AttributeError when attempting to use a non-existing property...

There might be an edge-case I'm not seeing and I'd like to step through some code that behaves incorrectly to get a sense of what's going on...

RomainMuller avatar Jun 29 '22 09:06 RomainMuller

@dmartin-gh

I am also experiencing the interface issues described above. We're looking to adopt the CDK for our main platform and as a mostly Python/C++ shop the Python CDK seemed like a great choice for us. I didn't make it very far into my first sample app to test things out before mypy started throwing errors.

from aws_cdk.core import Construct, Stack
from aws_cdk import aws_ec2 as ec2

class MyEC2Stack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        ec2.Instance(self, 'my-app',
            instance_type=ec2.InstanceType('m5.4xlarge'),
            machine_image=ec2.MachineImage.generic_linux({
                'us-east-1': 'ami-0affd4508a5d2481b', # CentOS 7 (x86_64) - with Updates HVM
            }),
            vpc=ec2.Vpc(self, "my-app-vpc"))

image

I would love to see this fixed so we can start building out our application stack with confidence that we're passing the right level of constructs around!

Is this still a problem today? I have been trying to reproduce this (with CDK v2, though), and am not getting the MyPy error you report here... if you still happen to have the error, maybe this is because of your specific MyMy version or configuration (in which case, I would like to know what these are so I can reproduce).

RomainMuller avatar Jun 29 '22 10:06 RomainMuller

@RomainMuller

Is this still a problem today?

It's much less of a problem after https://github.com/aws/jsii/pull/2809, but it still exists due to https://github.com/aws/jsii/issues/2927 / https://github.com/aws/aws-cdk/issues/15584

gshpychka avatar Jun 29 '22 10:06 gshpychka

It would also be nice if interfaces/protocols could be runtime checkable.

@jsii.implements(aws_cdk.IAspect)
class ConfigureConnections:
    def visit(self, node):
        # Raises error 
        # only protocols decorated with `@runtime_checkable` can be used for instance/subclass checks
        if not isinstance(node, ec2.IConnectable):
            return
        # ...
Aspects.of(scope).add(ConfigureConnections(my_resources))

Is there a workaround for this shortcoming other than manually implementing protocol checking?

spyoungtech avatar Jan 04 '23 22:01 spyoungtech

Passing dict in lieu of jsii structs does not consistently work

Similar to the Javascript idioms, Python developers will often intuitively use dict literals in places where the API expects some instance of a jsii struct.

For me this is the one thing that stops cdk8s with Python being widely usable. Is there any hope that it will be fixed?

LS80 avatar Oct 12 '23 14:10 LS80