spark-on-aws-lambda
Spark configuration for local deployment
Identify the key configuration for Spark running in local mode in a container. Reduce the JVM spin-up cost, maximize the usable memory in AWS Lambda, and reduce the container size.
- Determine the best way to deliver a set of Spark configuration to the script.
- Decide between a conf file and conf settings in the PySpark script. Ensure that the Spark configuration is not overwritten.
- JVM spin-up, memory/storage fraction, serializer.
- Caching and checkpointing to reduce the memory footprint. Ensure that data spills over to disk instead of staying in memory.
- Dynamically assign CPU and memory based on the machine that AWS Lambda picks.
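
One possible starting point for the items above, as a rough sketch rather than a settled design: build the settings in the PySpark script from what the Lambda runtime reports, and apply them before the JVM starts. The helper names `build_spark_conf` and `create_session`, the 75% memory share, the fraction values, and the `/tmp/spark` path are all assumptions for illustration and would need tuning. `AWS_LAMBDA_FUNCTION_MEMORY_SIZE` is the environment variable the Lambda runtime sets with the configured memory in MB.

```python
import os


def build_spark_conf(memory_mb=None, cores=None):
    """Return Spark settings sized to the current Lambda environment (hypothetical helper)."""
    if memory_mb is None:
        # Set by the Lambda runtime; fall back to 1024 MB when running elsewhere.
        memory_mb = int(os.environ.get("AWS_LAMBDA_FUNCTION_MEMORY_SIZE", "1024"))
    if cores is None:
        cores = os.cpu_count() or 1
    # Assumption: keep roughly 25% of memory for the Python process and JVM overhead.
    driver_mb = max(256, int(memory_mb * 0.75))
    return {
        "spark.master": f"local[{cores}]",
        "spark.driver.memory": f"{driver_mb}m",
        # Illustrative values only; tune for the workload.
        "spark.memory.fraction": "0.6",
        "spark.memory.storageFraction": "0.3",
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        # Skipping the web UI avoids starting a server nobody can reach.
        "spark.ui.enabled": "false",
        "spark.sql.shuffle.partitions": str(cores),
        # /tmp is the writable ephemeral storage in Lambda, used here for spill files.
        "spark.local.dir": "/tmp/spark",
    }


def create_session(app_name="spark-on-aws-lambda"):
    """Create a SparkSession with the settings above (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily so the helper above has no dependency

    builder = SparkSession.builder.appName(app_name)
    for key, value in build_spark_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Setting everything in one place in the script makes it easier to see which values win, which speaks to the concern about configuration being overwritten. For the caching item, persisting with `StorageLevel.MEMORY_AND_DISK` or `StorageLevel.DISK_ONLY` from `pyspark` lets cached data go to disk instead of holding it all in memory.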
Closing this