
RFC: Event Handler for API Gateway Web Sockets

Open walmsles opened this issue 2 years ago • 10 comments

Is this related to an existing feature request or issue?

#1165

Which Powertools for AWS Lambda (Python) utility does this relate to?

Event Handler - REST API

Summary

To provide an Event Handler class to enable the implementation of Web Socket routes for API Gateway Web Socket connections.

Background

Web Socket connections are serviced by API routes that behave in particular ways. I recently wrote about this in an article here. The implementation of the Web Socket class should take into account current Web Socket route nuances and enable customers to create Web Socket implementations quickly, with significantly less boilerplate code and a similar implementation pattern. At the end of the day, it is all about defining routes.

Use case

Enabling implementation of API Gateway Web Socket routes using exactly the same style of code as existing API Gateway implementations and enabling familiar handling of Web Socket routes.

Proposal

before:

from aws_lambda_powertools import Logger, Tracer
from aws_lambda_powertools.logging import correlation_paths
from aws_lambda_powertools.utilities.typing import LambdaContext

tracer = Tracer()
logger = Logger()

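@tracer.capture_method
def socket_connect(event: dict, context: LambdaContext) -> dict:
    # do connection processing
    return {"statusCode": 200}

@tracer.capture_method
def order_notify(event: dict, context: LambdaContext) -> dict:
    # do notify processing
    return {
        "statusCode": 200,
        "body": "order# 1234 Created",
    }

@tracer.capture_method
def socket_disconnect(event: dict, context: LambdaContext) -> dict:
    # do disconnection processing
    return {"statusCode": 200}

@tracer.capture_method
def route_not_found(event: dict, context: LambdaContext) -> dict:
    # placeholder: the handler below calls this helper, so it must be defined
    return {"statusCode": 404}

# You can continue to use other utilities just as before
@logger.inject_lambda_context(correlation_id_path=correlation_paths.API_GATEWAY_REST)
@tracer.capture_lambda_handler
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    # the event is a plain dict, so use key access rather than attribute access
    route_key = event["requestContext"]["routeKey"]
    if route_key == "$connect":
        return socket_connect(event, context)
    elif route_key == "order/notify":
        return order_notify(event, context)
    elif route_key == "$disconnect":
        return socket_disconnect(event, context)
    else:
        return route_not_found(event, context)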


after:

from aws_lambda_powertools import Logger, Tracer
from aws_lambda_powertools.event_handler import APIGatewayWebsocketResolver
from aws_lambda_powertools.logging import correlation_paths
from aws_lambda_powertools.utilities.typing import LambdaContext

tracer = Tracer()
logger = Logger()
app = APIGatewayWebsocketResolver()


@app.route("$connect")
@tracer.capture_method
def socket_connect():
    # do connection processing
    return {"statusCode": 200}

@app.route("order/notify")
@tracer.capture_method
def order_notify():
    # do notify processing
    return {
        "statusCode": 200,
        "body": "order# 1234 Created",
    }

@app.route("$disconnect")
@tracer.capture_method
def socket_disconnect():
    # do disconnection processing
    return {"statusCode": 200}

# You can continue to use other utilities just as before
@logger.inject_lambda_context(correlation_id_path=correlation_paths.API_GATEWAY_REST)
@tracer.capture_lambda_handler
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    return app.resolve(event, context)
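
To make the proposal concrete, here is a minimal sketch of how such a resolver might dispatch on requestContext.routeKey. SketchWebSocketResolver is a hypothetical stand-in for the proposed APIGatewayWebsocketResolver, not an existing Powertools class. The fallback to $default and the 404 for unmatched routes are assumptions for illustration only.

```python
from typing import Any, Callable, Dict

Handler = Callable[[], Dict[str, Any]]


class SketchWebSocketResolver:
    """Hypothetical stand-in for the proposed APIGatewayWebsocketResolver."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}
        self.current_event: Dict[str, Any] = {}

    def route(self, route_key: str) -> Callable[[Handler], Handler]:
        """Register a handler for a WebSocket route key."""

        def register(func: Handler) -> Handler:
            self._routes[route_key] = func
            return func

        return register

    def resolve(self, event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        """Look up the handler for the event's route key and call it."""
        self.current_event = event
        route_key = event["requestContext"]["routeKey"]
        # Assumption: fall back to a registered $default handler, if any.
        handler = self._routes.get(route_key) or self._routes.get("$default")
        if handler is None:
            # Assumption: report unmatched routes with a 404 status code.
            return {"statusCode": 404, "body": f"No handler for route {route_key}"}
        return handler()


app = SketchWebSocketResolver()


@app.route("$connect")
def socket_connect() -> Dict[str, Any]:
    return {"statusCode": 200}


@app.route("order/notify")
def order_notify() -> Dict[str, Any]:
    return {"statusCode": 200, "body": "order# 1234 Created"}
```

Handlers take no arguments here and would read the incoming event from app.current_event, mirroring the style of the existing REST resolvers.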

Out of scope

The implementation must be specifically focused on route definitions and handlers with utilities for common use-cases only and not touch any cloud infrastructure (as per tenets).

Potential challenges

End-to-End testing is one potential challenge - unsure if this style of interface exists, so async testing with live infrastructure will require careful thought and planning.

We need to carefully navigate route definition and look at what we do when no $connect route exists (which implies NO authorisation/authentication in the API gateway solution). Infrastructure as code tools should cater for this nuance, but we need to decide how to handle this situation specifically, which creates a cloud vulnerability.

We need to consider route definition carefully and decide on where this fits in the class hierarchy. Potentially this should exist separately from APIGatewayResolvers, given there are no HTTP Methods and other characteristics. There are Headers but only inbound within the actual event - never outbound.

There is lots to think about and discuss. I would like to see a familiar Resolver-style implementation for WebSocket routes, since they are analogous to APIGatewayResolver routes in many ways. Custom and $default routes can return an actual response, which is sent back to the calling WebSocket connection in the same way as any other broadcast message.

Dependencies and Integrations

No response

Alternative solutions

Custom implementation with boilerplate code.

Acknowledgment

walmsles avatar Oct 11 '23 11:10 walmsles

Hi @walmsles! Thanks for opening this RFC to add support for WebSockets. This has been a long-time wish and I'm happy to get this resolved now.

I'll add this for review next week, okay? If you have a PoC to share with us, that would be great.

Thanks

leandrodamascena avatar Oct 12 '23 13:10 leandrodamascena

Hi @leandrodamascena, no POC yet. No rush. Keen to hear from others in the community about what would make sense for a WebSocket event handler, use cases, existing problems and friction points.

The solution needs to remove boilerplate code and increase developer velocity.

Other thoughts that come to mind:

  • Not defining a $connect route should trigger a warning in debug mode (a missing $connect route is generally considered a security issue) - maybe the warning should always be emitted?
  • Propose a new class hierarchy in the same fashion as AppSync since it is a completely different protocol and feature set
  • Must include middleware capability - this makes sense for WebSocket handlers and is a useful feature. Could look to create a new BaseRoute class and push function handler and middleware processing to the new base (DRY).
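
The first point could be sketched roughly as follows. The function name warn_if_connect_missing and the warning text are hypothetical. Whether the check should run only in debug mode or always is still an open question in this thread, so the sketch always warns.

```python
import warnings
from typing import Iterable


def warn_if_connect_missing(route_keys: Iterable[str]) -> bool:
    """Warn when no $connect handler is registered; return True if a warning was emitted."""
    if "$connect" in set(route_keys):
        return False
    warnings.warn(
        "No $connect route is registered; connections may be accepted "
        "without any authorisation or authentication step.",
        UserWarning,
        stacklevel=2,
    )
    return True
```

A resolver could call this once with its registered route keys, for example on the first call to resolve().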

walmsles avatar Oct 14 '23 05:10 walmsles

@leandrodamascena - I'd like to help with this.

stevrobu avatar Oct 17 '23 14:10 stevrobu

Hi @stevrobu! Yes, help us create this new Resolver in Event Handler! For now, we need help discussing the best developer experience and the possible problems/benefits of integrating WebSocket into our utility. Please feel free to make comments, create a PoC, ask questions, and do whatever you can to contribute to this.

Thanks!

leandrodamascena avatar Oct 17 '23 20:10 leandrodamascena

@leandrodamascena - I have a POC completed. I will get it checked into my branch and add comments/documentation for it by end of the week.

stevrobu avatar Oct 30 '23 14:10 stevrobu

It would be nice to separate out the different route types. My example above used a central route definition even for the predefined routes ($connect, $disconnect and $default). But I think it would be nice to represent the predefined WebSocket routes as distinct handlers.

from aws_lambda_powertools import Logger, Tracer
from aws_lambda_powertools.event_handler import APIGatewayWebsocketResolver
from aws_lambda_powertools.logging import correlation_paths
from aws_lambda_powertools.utilities.typing import LambdaContext

tracer = Tracer()
logger = Logger()
app = APIGatewayWebsocketResolver()


@app.connect()
@tracer.capture_method
def socket_connect():
    # do connection processing
    return {"statusCode": 200}

@app.default()
def socket_default():
    # do default route handling (catch-all)
    return {"statusCode": 200}


@app.route("order/notify")
@tracer.capture_method
def order_notify():
    # do notify processing
    return {
        "statusCode": 200,
        "body": "order# 1234 Created",
    }

@app.disconnect()
@tracer.capture_method
def socket_disconnect():
    # do disconnection processing
    return {"statusCode": 200}


# You can continue to use other utilities just as before
@logger.inject_lambda_context(correlation_id_path=correlation_paths.API_GATEWAY_REST)
@tracer.capture_lambda_handler
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    return app.resolve(event, context)
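
One way the connect(), default() and disconnect() helpers could be implemented is as thin wrappers over route(), so that all routes still end up in a single routing table. This is only a sketch: RouteHelpersSketch and its routes attribute are hypothetical names, not part of Powertools.

```python
from typing import Any, Callable, Dict

Handler = Callable[[], Dict[str, Any]]
Decorator = Callable[[Handler], Handler]


class RouteHelpersSketch:
    """Hypothetical resolver showing helper decorators for the predefined routes."""

    def __init__(self) -> None:
        self.routes: Dict[str, Handler] = {}

    def route(self, route_key: str) -> Decorator:
        """Register a handler for any route key, custom or predefined."""

        def register(func: Handler) -> Handler:
            self.routes[route_key] = func
            return func

        return register

    # Each helper simply delegates to route() with the predefined route key.
    def connect(self) -> Decorator:
        return self.route("$connect")

    def disconnect(self) -> Decorator:
        return self.route("$disconnect")

    def default(self) -> Decorator:
        return self.route("$default")
```

With this shape, @app.connect() and @app.route("$connect") are interchangeable, which keeps the helpers purely a readability feature.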

Common Web socket use cases that come to mind:

  1. Async Task processing (executing longer-running tasks)
  2. Submitting Events to an EDA of some form
  3. Channel notifications - web socket connections subscribe to one or more channels for broadcast event notifications

Is it useful to build adapters for these use cases that wrap up the common tooling developers need, to accelerate development?

Not sure exactly what these should look like but things that come to mind:

  • Use Case 1 - this is really just a different form of Use case 2 IMO - a different EDA mechanism for launching a long running task
  • Use case 2 - provide an EDA provider enabling users to inject an event submission provider with a transformer to transform the API data into the event pattern required. The actual implementation can be included for EventBridge, SQS and SNS as immediate starters to accelerate these use cases.
  • Use case 3: Provide a storage adapter for storing connection data into a data store (optional addition). We could then create a class that represents a WebSocketChannelBroadcast use-case which automates the saving and deletion of connections from a data store injected via an adapter pattern (like the Idempotency Storage Provider idea).
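
As a rough illustration of the storage adapter idea in use case 3, the sketch below defines an abstract connection store plus an in-memory implementation. The names ConnectionStore, InMemoryConnectionStore and their methods are hypothetical. A real provider would more likely be backed by a data store such as DynamoDB.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Dict, Set


class ConnectionStore(ABC):
    """Hypothetical adapter interface for persisting WebSocket connection data."""

    @abstractmethod
    def save(self, channel: str, connection_id: str) -> None:
        """Subscribe a connection to a channel."""

    @abstractmethod
    def delete(self, connection_id: str) -> None:
        """Remove a connection from every channel (for example on $disconnect)."""

    @abstractmethod
    def connections_for(self, channel: str) -> Set[str]:
        """Return the connection IDs subscribed to a channel."""


class InMemoryConnectionStore(ConnectionStore):
    """In-memory implementation, useful only for tests and local experiments."""

    def __init__(self) -> None:
        self._channels: Dict[str, Set[str]] = defaultdict(set)

    def save(self, channel: str, connection_id: str) -> None:
        self._channels[channel].add(connection_id)

    def delete(self, connection_id: str) -> None:
        for members in self._channels.values():
            members.discard(connection_id)

    def connections_for(self, channel: str) -> Set[str]:
        # Return a copy so callers cannot mutate the store's internal state.
        return set(self._channels.get(channel, set()))
```

Injecting the store through an interface like this would let users swap in their own implementation, in the same spirit as the Idempotency storage providers mentioned above.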

These are typical use cases and all involve a WHOLE heap of boilerplate code, which we could provide to accelerate them rather than just creating a simple Web Socket adapter imitating the APIGatewayRestResolver patterns. The APIGatewayRestResolver style pattern is still needed as the basis for adding these kinds of use cases on top.

walmsles avatar Nov 03 '23 23:11 walmsles

I have a POC in my fork here: https://github.com/stevrobu/powertools-lambda-python/tree/2023-10-30-stevrobu-add-websocket-event-manager

To test it out:

  1. Clone https://github.com/stevrobu/powertools-lambda-python
  2. Switch to 2023-10-30-stevrobu-add-websocket-event-manager
  3. Upload the powertools-for-lambda-python-poc.zip to an S3 bucket in your account. This zip includes the powertools changes for the WebSocket Event Handler. The SAM template will use this to create a Lambda Layer for the test Lambda.
  4. From root of cloned folder: cd ./examples/event_handler_websocket
  5. sam build
  6. sam deploy --parameter-overrides LayerBucketName=<Name of Bucket Containing zip File>
  7. Copy the WebSocketURI output from the SAM deploy
  8. In CloudShell run wscat to connect: wscat -c <URI Value from SAM Deploy Output>
  9. Send the following to the connection: {"action": "joinroom"}
  10. Close the connection

Note that this is just a very basic start. There are a number of additional topics to discuss including but not limited to:

  1. Helper methods for standard routes: i.e. app.connect() instead of app.route('$connect').
  2. Handling compression
  3. Middlewares have been removed for this POC
  4. Ensuring use cases mentioned above are covered.

stevrobu avatar Nov 06 '23 19:11 stevrobu

hey @stevrobu - is there a doc or code snippet you could share about this POC? With the recent major addition of Data Validation that drastically changes the authoring experience, we might be in a better place to discuss DX for WebSockets next year.

I'm still torn on whether this is worth investing in compared to doubling down on AppSync Event Handler - would love to hear from customers who are using API Gateway WebSockets, and what boilerplate they had to come up with themselves to get it working in prod

heitorlessa avatar Dec 15 '23 13:12 heitorlessa

I want to come back to this one. I implemented a "long-running task" via websocket Typescript CDK construct last year and wrote about this in a blog article.

We need to stay hands-off on infrastructure for API Gateway WebSockets. Quite a lot of infrastructure is required for the surrounding elements (broadcast etc.). The focus should be on boilerplate code - getting the right code in place to accelerate this use case with prescriptive guidance.

  1. Broadcast use-case - an implementation similar to the Idempotency utility makes sense here, with a storage provider for connection data management so people can sub in the other implementations they need. We need to be clear that storage is not required for anything but a broadcast use case - people don't understand that.
  2. Broadcast Notification Processing
    1. finding connections to send notifications to based on "channel" for notification (using StorageProvider)
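
The broadcast step could look roughly like the sketch below. The names broadcast, Send and ConnectionGone are hypothetical. The actual delivery is injected as a callable so the example runs without AWS access. In a Lambda function that callable would typically wrap the post_to_connection call of boto3's apigatewaymanagementapi client.

```python
import json
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical sender signature: (connection_id, payload) -> None.
# In a real Lambda this could wrap
# boto3.client("apigatewaymanagementapi", endpoint_url=...).post_to_connection(
#     ConnectionId=connection_id, Data=payload)
Send = Callable[[str, bytes], None]


class ConnectionGone(Exception):
    """Hypothetical signal that the client behind a connection ID has disconnected."""


def broadcast(connection_ids: Iterable[str], message: Dict[str, Any], send: Send) -> List[str]:
    """Send a message to every connection and return the IDs that were stale."""
    payload = json.dumps(message).encode("utf-8")
    stale: List[str] = []
    for connection_id in connection_ids:
        try:
            send(connection_id, payload)
        except ConnectionGone:
            # Collect stale connections so the caller can delete them from storage.
            stale.append(connection_id)
    return stale
```

The connection IDs would come from the storage provider lookup for the notification's channel, and the returned stale IDs could be removed from that same store.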

@stevrobu - Where are you at with your POC? Be good to review - can you provide a sample of the user experience and what you have implemented?

walmsles avatar Feb 25 '24 03:02 walmsles

@walmsles, @heitorlessa - I apologize for the delay in getting back. These messages were lost in the shuffle. As far as what I have implemented, it is in my post from Nov 6 above. The sample code is here: https://github.com/stevrobu/powertools-lambda-python/tree/2023-10-30-stevrobu-add-websocket-event-manager/examples/event_handler_websocket

I put the POC code in here: https://github.com/stevrobu/powertools-lambda-python/blob/2023-10-30-stevrobu-add-websocket-event-manager/aws_lambda_powertools/event_handler/api_gateway.py but you will need to use a lambda layer as stated in the instructions for running the example: https://github.com/stevrobu/powertools-lambda-python/tree/2023-10-30-stevrobu-add-websocket-event-manager/examples/event_handler_websocket

There is not a whole lot here beyond adding the python decorators so that the experience is similar to the REST event handler. I'd love to discuss the use cases more.

stevrobu avatar Mar 12 '24 19:03 stevrobu