cloudtrail-parquet-glue
Fixed partitioning issue during raw-to-Parquet conversion
Changed the mappings in glue_etl.py to tie the Glue-assigned "partition_[0-6]" names to awslogs, account, region, etc. This should fix the errors reported in Issue 1.
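As a rough illustration of the kind of mapping described above, here is a minimal sketch in plain Python. It assumes the standard CloudTrail key layout (AWSLogs/<account>/CloudTrail/<region>/<year>/<month>/<day>/), so that the crawler's seven positional partitions line up as shown. The target names beyond awslogs, account and region, and the helper names partition_mappings and rename_partitions, are assumptions for illustration and are not taken from glue_etl.py.

```python
# Assumed correspondence between Glue's positional partition columns and the
# segments of a standard CloudTrail S3 key. Names past "region" are guesses.
PARTITION_NAMES = {
    "partition_0": "awslogs",
    "partition_1": "account",
    "partition_2": "cloudtrail",
    "partition_3": "region",
    "partition_4": "year",
    "partition_5": "month",
    "partition_6": "day",
}


def partition_mappings(names=PARTITION_NAMES):
    """Build (source, source_type, target, target_type) tuples.

    This is the tuple shape that Glue's ApplyMapping transform takes in its
    `mappings` argument; partition values are treated as strings here.
    """
    return [(src, "string", dst, "string") for src, dst in sorted(names.items())]


def rename_partitions(record, names=PARTITION_NAMES):
    """Rename partition keys in a plain dict, leaving other keys untouched."""
    return {names.get(key, key): value for key, value in record.items()}
```

In a Glue job, the list from partition_mappings() would be appended to the mappings for the CloudTrail event fields before calling ApplyMapping; rename_partitions is only there to make the effect easy to check locally without a Glue environment.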
At a minimum, we need a solution that understands all the variables that CloudTrail can introduce. It would be even better to have the option to map the structure ourselves, as there are different use cases. This matters especially in enterprise-level accounts, where the data is often not written directly by CloudTrail: Organizations does not allow child accounts to access the original trail data, so something like a "write-back" mechanism to a different S3 bucket is used instead.
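One way the requested "map the structure ourselves" option could look is a layout template that names each path segment. The sketch below is only an assumption about how such an option might work, not something the project provides; DEFAULT_LAYOUT, parse_key, and the "copied/..." layout in the usage note are hypothetical.

```python
# Hypothetical layout template: "{name}" segments capture a value, any other
# segment must match the key literally.
DEFAULT_LAYOUT = "AWSLogs/{account}/CloudTrail/{region}/{year}/{month}/{day}"


def parse_key(key, layout=DEFAULT_LAYOUT):
    """Extract named partition values from an S3 key using a layout template."""
    layout_parts = layout.strip("/").split("/")
    key_parts = key.strip("/").split("/")
    if len(key_parts) < len(layout_parts):
        raise ValueError(f"key {key!r} is shorter than layout {layout!r}")

    values = {}
    # zip stops at the end of the layout, so trailing segments such as the
    # object file name are ignored.
    for pattern, actual in zip(layout_parts, key_parts):
        if pattern.startswith("{") and pattern.endswith("}"):
            values[pattern[1:-1]] = actual
        elif pattern != actual:
            raise ValueError(f"expected {pattern!r} but found {actual!r} in {key!r}")
    return values
```

A bucket populated by a write-back copy could then supply its own template, for example parse_key(key, "copied/{account}/{region}/{year}/{month}/{day}"), while the default keeps working for trails written directly by CloudTrail.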