athena-cloudtrail-partitioner
Lambda times out before getting all partitions in CloudTrail bucket
The code doesn't work on a CloudTrail bucket with lots of data, for example a CloudTrail bucket with a year's worth of data from 100+ accounts across all regions.
The Lambda function reaches its execution time limit before the call on line 17 of handler.js finishes: const partitionTree = await getAllParitions(bucket, path);
Also, please note that there is a minor typo in the getAllParitions method: it should probably be getAllPartitions. But since the method is spelled the same way in s3.js, it doesn't really matter.
What does matter is that it can take more than 15 minutes to enumerate a CloudTrail bucket with lots of data. Is there a way you could store the enumeration state in DynamoDB as well, so that multiple runs of the Lambda could pick up where the previous one left off?
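To illustrate the idea, here is a rough sketch of a resumable enumeration. This is not the project's actual code; the names enumeratePartitions, listPage, and checkpointStore are hypothetical. The point is to persist S3's continuation token after each page (e.g. in a DynamoDB item), so a later invocation can resume instead of restarting from scratch. The page lister and checkpoint store are injected so the resume logic stands on its own; in a real Lambda they would wrap S3 ListObjectsV2 and a DynamoDB table.

```javascript
// Sketch only: resume S3 enumeration across Lambda invocations by
// checkpointing the continuation token. All names here are
// hypothetical, not from the repository.
async function enumeratePartitions(listPage, checkpointStore, deadlineMs) {
  const start = Date.now();
  let token = await checkpointStore.load(); // undefined on the first run
  const keys = [];
  for (;;) {
    const page = await listPage(token); // { keys: [...], nextToken }
    keys.push(...page.keys);
    token = page.nextToken;
    if (!token) {
      await checkpointStore.clear(); // finished; next run starts fresh
      return { done: true, keys };
    }
    if (Date.now() - start > deadlineMs) {
      await checkpointStore.save(token); // stop early, resume next run
      return { done: false, keys };
    }
  }
}
```

With something like this, the Lambda could be triggered repeatedly (e.g. on a schedule) until a run returns done: true, rather than needing to complete the whole listing within one 15-minute window.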
I just raised the Lambda execution time to 15 minutes and the memory to 10 GB - but now the function fails because it runs out of memory:
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2020-12-16T14:58:02.038+01:00 REPORT RequestId: 171b6a40-d2f3-4cd7-b281-c0f66e659fcd Duration: 345439.53 ms Billed Duration: 345440 ms Memory Size: 10240 MB Max Memory Used: 10240 MB Init Duration: 536.94 ms
So even 10 GB of memory on the Lambda function is not enough when running against an organizational CloudTrail bucket with lots of data.
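The memory blow-up suggests the function is accumulating every S3 key before building partitions. A possible mitigation, sketched below under my own assumptions (I haven't checked how s3.js actually builds its tree), is to extract only the distinct partition tuples per page and discard the keys themselves; for CloudTrail the partition is roughly (account, region, date), which is a tiny set compared to the full key list. The key pattern in the regex is based on the documented CloudTrail layout, where keys contain <account-id>/CloudTrail/<region>/<yyyy>/<mm>/<dd>/, and listPage is a hypothetical wrapper around S3 ListObjectsV2.

```javascript
// Sketch only: keep distinct partition tuples instead of all keys.
// Matches the ".../<12-digit account>/CloudTrail/<region>/<yyyy>/<mm>/<dd>/..."
// segment found in both single-account and organization trail keys.
function extractPartition(key) {
  const m = key.match(
    /(\d{12})\/CloudTrail\/([^/]+)\/(\d{4})\/(\d{2})\/(\d{2})\//
  );
  if (!m) return null; // e.g. CloudTrail-Digest keys are skipped
  const [, account, region, year, month, day] = m;
  return `${account}/${region}/${year}-${month}-${day}`;
}

async function collectPartitions(listPage) {
  const partitions = new Set();
  let token;
  do {
    const page = await listPage(token); // { keys: [...], nextToken }
    for (const key of page.keys) {
      const p = extractPartition(key);
      if (p) partitions.add(p);
    }
    token = page.nextToken; // each page of keys is dropped after this
  } while (token);
  return partitions;
}
```

Memory then scales with the number of (account, region, day) combinations rather than the number of log files, which should stay well under the Lambda limit even for an organizational trail.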