aws-sdk-pandas
Segmentation Fault during Lambda Execution
Describe the bug
Hello,
I work in AWS Support and I am raising a GitHub issue on behalf of a customer.
The customer uses the Lambda Layer for AWS Data Wrangler in their Lambda function to read JSON files from S3, create pandas DataFrames using awswrangler, process the file data, create a Glue catalog table, and store the flattened data in S3 in Parquet format. A small number of Lambda invocations out of thousands fail with the error `Runtime exited with error: signal: segmentation fault Runtime.ExitError`.
After some research, it appears that this type of error is raised by a binary dependency (implemented in C/C++) used by a Python module. From the many resources I have looked at on the web, pandas in particular is known to hit this type of error for a variety of root causes.
I am trying to assist the customer with guidance on how to troubleshoot the root cause of their segmentation faults going forward, particularly on how to gather more useful debug information. Currently the logs for the failed Lambda invocations end abruptly with the line `Runtime exited with error: signal: segmentation fault Runtime.ExitError`, and there is no insight into what is happening with pandas and the related binary dependencies.
From what I gather from [1][2], debugging segmentation faults in the binary dependencies of pandas is not entirely straightforward.
Can you please provide guidance on what steps can be taken to output verbose debug logging for the aws data wrangler layer and its binary extensions in the Lambda environment? In particular, it would be great if we could have steps to collect debugging and replication data so that we can come back with the information needed for troubleshooting for an issue on this repo. Any other insights you may have would be appreciated.
References:
[1] https://pandas.pydata.org/docs/development/debugging_extensions.html
[2] https://blog.richard.do/2018/03/18/how-to-debug-segmentation-fault-in-python/
How to Reproduce
Unfortunately, we do not have detailed steps on how to reproduce the issue other than the Lambda execution logs and the names of the files that were being processed.
The issue happened intermittently (only for a few invocations out of thousands). The customer noted that the failures only occurred while they were using version 5 of the Lambda Layer and have not recurred since they upgraded to later versions.
At this point we are seeking guidance on how best to gather debugging information to troubleshoot the issues in more detail.
Expected behavior
For the error not to happen and for the Lambda runtime not to exit.
Your project
No response
Screenshots
No response
OS
Amazon Linux 2, underlying OS for Python Lambda Runtime
Python version
3.9
AWS SDK for pandas version
arn: arn:aws:lambda:us-east-1:336392948345:layer:AWSDataWrangler-Python39:5
Additional context
No response
Hi @Alexander-Ludwig, in terms of debugging, my advice would be for the customer to first enable logging in their Lambda function code. They could add a line for pandas too:
logging.getLogger("pandas").setLevel(logging.DEBUG)
Hopefully this would give them more visibility into the error.
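For reference, here is a minimal sketch of what that setup could look like at the top of the handler module. It also turns on the standard library's `faulthandler`, which writes the Python traceback to stderr when the process receives a fatal signal such as SIGSEGV. That part is my own suggestion rather than something awswrangler provides, and the handler body is a hypothetical placeholder. Whether the traceback reaches CloudWatch before the runtime exits is an assumption worth verifying in a test function.

```python
import faulthandler
import logging

# Dump the Python-level traceback of every thread to stderr if the
# interpreter receives SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL.
# Setting the environment variable PYTHONFAULTHANDLER=1 has a similar effect.
faulthandler.enable(all_threads=True)

# The Lambda Python runtime pre-configures a handler on the root logger,
# so only the levels need to be adjusted here.
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
logging.getLogger("awswrangler").setLevel(logging.DEBUG)
logging.getLogger("pandas").setLevel(logging.DEBUG)


def lambda_handler(event, context):
    # Hypothetical handler: log the input before any native code runs, so
    # the last log line before a crash identifies the file being processed.
    logger.debug("Received event: %s", event)
    # ... read from S3, build the DataFrame, write Parquet ...
    return {"faulthandler_enabled": faulthandler.is_enabled()}
```

The traceback only shows which Python call was active when the crash happened, not the C/C++ frames, but that is usually enough to narrow the problem down to a specific read or write call and input file.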
It's interesting that they haven't encountered this error since upgrading from version 5 of the layer. It could very well be that pandas introduced a fix in a later version, which we have released as part of the most recent awswrangler layers.
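To compare what the different layer versions actually ship, the function could log the installed package versions once at cold start. This is only a sketch: the package list is my guess at the relevant binary dependencies, and it assumes the layer includes the packages' metadata, otherwise a version is reported as `None`.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed set of packages worth tracking; adjust as needed.
PACKAGES = ("awswrangler", "pandas", "numpy", "pyarrow")


def installed_versions(packages=PACKAGES):
    """Return a mapping of package name to installed version (or None)."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package (or its metadata) is not present in this environment.
            found[name] = None
    return found


# Printed at import time, i.e. once per cold start in Lambda.
print("Installed versions:", installed_versions())
```

Including this output in a bug report makes it easier to tell whether a failing and a working layer differ in their pandas, NumPy, or PyArrow versions.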
Is this closed? I've been running into the same error
I get the same error when using the python3-saml library.