powertools-lambda-python icon indicating copy to clipboard operation
powertools-lambda-python copied to clipboard

Bug: `log_uncaught_exceptions` only works locally and not on lambda

Open bellis-ai opened this issue 2 years ago • 21 comments

Expected Behaviour

When log_uncaught_exceptions=True, uncaught exceptions in a lambda are handled and transformed to a structured log.

Current Behaviour

Uncaught exceptions are printed unstructured.

Code snippet

from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger(log_uncaught_exceptions=True)

def lambda_handler(event: dict, context: LambdaContext) -> str:
    raise(Exception("asdf"))

Possible Solution

Probably none. Document this behavior.

I don't think sys.excepthook can be overridden in a lambda container, but I have no proof because the original forum thread that details this has been deleted by AWS with no replacement.

Steps to Reproduce

Set log_uncaught_exceptions=True in an actual running lambda. If log_uncaught_exceptions=True on a local machine, the behavior runs just as intended.

AWS Lambda Powertools for Python version

latest

AWS Lambda function runtime

3.9

Packaging format used

Lambda Layers

Debugging logs

No response

bellis-ai avatar Dec 27 '22 19:12 bellis-ai

Thanks for opening your first issue here! We'll come back to you as soon as we can. In the meantime, check out the #python channel on our AWS Lambda Powertools Discord: Invite link

boring-cyborg[bot] avatar Dec 27 '22 19:12 boring-cyborg[bot]

Crossposted this on the re:Post forum: https://repost.aws/questions/QUlMvn9vDzRiyDsGnQ0SoztQ/how-do-i-override-the-sys-excepthook-in-a-lambda-python-container

bellis-ai avatar Dec 27 '22 19:12 bellis-ai

Hello @bellis-ai. Thanks for reporting this bug and posting the helpful links for review. We were able to reproduce the error in our local tests and are looking into how to resolve it. This failure was most likely because we made this update and we didn't have E2E tests for it.

We'll be back with an update here as soon as possible.

leandrodamascena avatar Dec 27 '22 22:12 leandrodamascena

Coming back here with feedback. We tried to find a solution and ran into the same problem: we couldn't get the excepthook to work inside the Lambda runtime and we couldn't catch exceptions using log_uncaught_exceptions=True. We are reaching out internally to the Lambda team to understand if there is any workaround to do in this case or if we are missing something.

Not sure if this is suitable for you, but for now you can use Middleware Factory to catch all exceptions and log them centrally. We had a discussion about this on our Discord Server and it might be helpful in your case.

image

Thank you for your patience @bellis-ai and we are working to resolve this as soon as possible.

leandrodamascena avatar Jan 02 '23 18:01 leandrodamascena

Updating this issue... We're working closely with the Lambda team to investigate the issue and determine if we have a workaround to resolve the issue.

I'll come back here as soon as I have an update. In the meantime, I'll update the documentation to suggest not using this code in production and putting this issue in a blocking state.

leandrodamascena avatar Jan 10 '23 14:01 leandrodamascena

Update: @leandrodamascena updated the docs!

Adding this to our board of activities so we can do bi-weekly follow ups with the Lambda team - (no update beyond acknowledge yet).

heitorlessa avatar Jan 31 '23 08:01 heitorlessa

Are there updates on this issue?

matteocontrini avatar Mar 09 '23 09:03 matteocontrini

Not yet @matteocontrini, meeting the team early next week for a follow up and post here if anything changes. Appreciate the patience

heitorlessa avatar Mar 09 '23 09:03 heitorlessa

Not yet @matteocontrini, meeting the team early next week for a follow up and post here if anything changes. Appreciate the patience

I was hoping to use this feature on a new project, then saw the note cautioning on using it, and found this open issue.

Did the meeting happen? Is there a plan to address this?

jth08527 avatar May 17 '23 18:05 jth08527

Meeting tomorrow one last time, Joshua. So far, we’ve discovered that all languages behave like that, and it’s a case of “it’s a feature not a bug” in the way Lambda catches all exceptions to report errors properly.

Should have a definitive answer soon as we keep on following up for potential workarounds

On Wed, 17 May 2023 at 20:49, Joshua Hasner @.***> wrote:

Not yet @matteocontrini https://github.com/matteocontrini, meeting the team early next week for a follow up and post here if anything changes. Appreciate the patience

I was hoping to use this feature on a new project, then saw the note cautioning on using it, and found this open issue.

Did the meeting happen? Is there a plan to address this?

— Reply to this email directly, view it on GitHub https://github.com/awslabs/aws-lambda-powertools-python/issues/1798#issuecomment-1551894497, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAZPQBAZVZTZZKXO5GN7H6LXGUMUVANCNFSM6AAAAAATKVO5UE . You are receiving this because you were assigned.Message ID: @.***>

heitorlessa avatar May 17 '23 19:05 heitorlessa

Good news! Met with Lambda engineering to rule out options, and we have a potential solution for all runtimes.

I say potential because security review might take a while to ensure there's no angle for abuse.

Setting expectations

One caveat tho:

It might be applied to newer runtimes only (pending investigation). This means, if we can't backport this will only work for Python 3.11+.

Lambda will literally change the component responsible for handling every customer's request to ensure each programming language hook for unhandled exception works as expected without side effects or abuse - we'll take due diligence as expected for any component in the critical path.

We'll keep you posted; progress nevertheless.

I appreciate everyone's patience on this.

heitorlessa avatar May 26 '23 04:05 heitorlessa

@heitorlessa Hi, seeing that the Python 3.11 Lambda runtime has been released, are there any updates here? Thanks!

kbakk avatar Nov 20 '23 18:11 kbakk

Sadly no Kristoffer, the solution Lambda team thought it was possible didn’t end up being robust enough - they’re revisiting it after 3.12 is launched, as it affects all languages

On Mon, 20 Nov 2023 at 19:04, Kristoffer Bakkejord @.***> wrote:

@heitorlessa https://github.com/heitorlessa Hi, seeing that the Python 3.11 Lambda runtime has been released, are there any updates here? Thanks!

— Reply to this email directly, view it on GitHub https://github.com/aws-powertools/powertools-lambda-python/issues/1798#issuecomment-1819560056, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAZPQBF547LRFG34YTNTDQLYFOLSZAVCNFSM6AAAAAATKVO5UGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJZGU3DAMBVGY . You are receiving this because you were mentioned.Message ID: @.*** com>

heitorlessa avatar Nov 21 '23 06:11 heitorlessa

I'm aware that under the hood the things differ, but if the only motivation for this issue is being able to see the unhandled exceptions logged in a JSON format, the introduction of advanced logging controls might make it superfluous. If I specify JSON log format in the Lambda function's logging configuration, all Lambda system events (including the unhandled exceptions) get logged in a JSON format.

image

tlinhart avatar Nov 21 '23 09:11 tlinhart

That unfortunately causes another unwanted side effect @tlinhart -- any warnings coming from libraries or your own emitted warnings will be dropped with this configuration.

The new ALC (Advanced Lambda Control) will look at the message, it doesn't have the required fields and it will convert to INFO. However your ALC log level is set to ERROR, so this will effectively drop any INFO (thus that warning) too.

heitorlessa avatar Nov 21 '23 10:11 heitorlessa

I usually capture the warnings using logging.captureWarnings(True) which I thought would suffice in this case. Also, I set the ALC to the most granular level and manage the log level solely in Powertools via the POWERTOOLS_LOG_LEVEL variable.

tlinhart avatar Nov 21 '23 10:11 tlinhart

I haven't tried but I suspect if you explicitly use captureWarnings(True) it might work because ALC feature changes the Root Logger.

If my memory serves, captureWarnings(True) configures a logger named py.warnings, which by itself isn't configured to the same log level. Since the root is, it should propagate, just like NullHandler from 3rd party libraries will propagate all debug logs once ALC is enabled to DEBUG (e.g., botocore, requests, etc.)

heitorlessa avatar Nov 21 '23 10:11 heitorlessa

TL;DR - we're still working with the AWS Lambda team on this, and related issues with the newly launched Advanced Logging Control like the warning above.

heitorlessa avatar Nov 21 '23 11:11 heitorlessa

@heitorlessa, any updates on how to get Python exceptions logged? I understand that the log_uncaught_exceptions feature is slated for removal, but we need some alternative method.

jmehnle avatar May 21 '24 22:05 jmehnle

Hey Julian!! Not yet, we’re still waiting to hear from Lambda whether this will be fixed on runtime for all languages. We don’t plan to remove the flag until we have absolute guarantee Lambda won’t solve it — disregard the v3 RFC for now.

A potential plan B is registering an internal Lambda extension (aka thread) that gives us 500ms guarantee for cleanups in Lambda (SIGTERM and environment recycling) — however I haven’t tested if the thread would receive the unhandled exception; if not it’d need something creative here.

A refresher on the situation

The reason this doesn’t work today is that Lambda Runtime (RIC) component catches all exceptions raised for any language, so it can report correctly that X happened, usage in ms, generate metrics, and handle other internal affairs — this is why no language unhandled exceptions native mechanisms work.

On Wed, 22 May 2024 at 08:28, Julian Mehnle @.***> wrote:

@heitorlessa https://github.com/heitorlessa, any updates on how to get Python exceptions logged? I understand that the log_uncaught_exceptions feature is slated for removal, but we need some alternative method.

— Reply to this email directly, view it on GitHub https://github.com/aws-powertools/powertools-lambda-python/issues/1798#issuecomment-2123537505, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAZPQBDQWGEEOOZTXZQWDODZDPDALAVCNFSM6AAAAAATKVO5UGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMRTGUZTONJQGU . You are receiving this because you were mentioned.Message ID: @.*** com>

heitorlessa avatar May 21 '24 23:05 heitorlessa

HELP WANTED -- here's a backup plan that wouldn't involve a no-op Lambda internal extension.

We know nearly all customers use the decorator @logger.inject_lambda_context to enrich the Lambda context, opportunistically log incoming events, set correlation ID dynamically etc.

We could transparently update that decorator to catch all exceptions, log the exception raised if log_uncaught_exceptions=True, and re-raise them to mimic as closely as possible Python's native uncaught exception capability.

What do you think?

That way, if Lambda ever fixes it, we can transparently remove the try/catch from within the decorator code. Either way this will start working for you in the next upgrade.


Related discussion about buffer debug logs where this could come in handy too.

https://github.com/aws-powertools/powertools-lambda-python/issues/4432#issuecomment-2147856521

heitorlessa avatar Jun 04 '24 15:06 heitorlessa