spark-on-aws-lambda
spark-on-aws-lambda copied to clipboard
feat: Add complete Apache Iceberg integration with AWS Glue Catalog s…
…upport
- Add comprehensive Iceberg functions library (libs/glue_functions/iceberg_glue_functions.py)
- Implement production-ready Lambda handlers for Iceberg table operations
- Add time travel queries and metadata access capabilities
- Include advanced features: schema evolution, table history, snapshots
- Provide complete ETL pipeline with data quality checks
- Add CloudFormation infrastructure templates for AWS deployment
- Include comprehensive usage examples and documentation
- Support for reading Iceberg tables from AWS Glue Catalog
- Full S3 integration with proper AWS credentials handling
- Production deployment scripts and cleanup utilities
Key Features:
- Read Iceberg tables with time travel support
- Query table metadata (history, snapshots, schema)
- Complete Spark session configuration for Lambda
- Error handling and comprehensive logging
- Multiple Lambda handler templates for different use cases
- Infrastructure as Code with CloudFormation
- End-to-end testing and validation scripts
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.