amazon-emr topic
aws-dbs-refarch-datalake
Reference Architectures for Datalakes on AWS
demo-code
Bits of code I use during live demos
modern-data-lake-storage-layers
Jupyter notebooks and an AWS CloudFormation template that show how Hudi, Iceberg, and Delta Lake work
dataflow-runner
Run templatable playbooks of Hadoop, Spark, and other jobs on Amazon EMR
aws-airflow-demo
Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for Apache Airflow (MWAA) on AWS.
emr-demo
Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce.
terraform-emr-spark-example
An example Terraform project that configures a secure and customizable Spark cluster on Amazon EMR.
amazon-emr-with-delta-lake
Amazon EMR Notebook to show how to read from and write to Delta tables with Amazon EMR
amazon-emr-cli
A command-line interface for packaging, deploying, and running your EMR Serverless Spark jobs
amazon-emr-vscode-toolkit
A VS Code extension that makes it easier to manage and develop Spark jobs on EMR