aws-glue-docker
🐋 Docker image for AWS Glue Spark/Python
Supported tags and respective Dockerfile links
Simple Tags
Python Shell
Spark
You can use Python extension modules and libraries with your AWS Glue ETL scripts as long as they are written in pure Python. C libraries such as pandas are not supported at the present time, nor are extensions written in other languages.
-- AWS
Deprecated, please migrate to v3/v4
AWS Glue Docker
AWS Glue development environment based on the svajiraya/aws-glue-libs fix.
- Announced released bin '19
- Python Shell Supported Library
- Python Shell version running
- Glue lib reference
- Glue Dynamic frames
- Glue script samples
- Known Issues for AWS Glue
- packaged with: debian 10, OpenJDK 8, spark 2.4, maven 3.6, python 3.6, pip 20, pytest, glue lib, boto3
- additionally: aws cli, cdk, poetry
- Samples:
  - glue: /opt/samples/glue
  - cdk: /opt/samples/cdk
  - cloudformation: /opt/samples/cloudformation
Getting started
# install docker and configure aliases
curl -sSL https://raw.githubusercontent.com/webysther/aws-glue-docker/master/start.sh | sh
# to use pandas
glue
# or pyspark
glue-spark
# here you are inside docker
# Glue PySpark (REPL)
pyspark
# Glue PySpark
# /app is your current folder
glue-spark sparksubmit /app/spark_script.py
# Test
glue pytest
# aliases inside docker (kept for backwards compatibility)
# gluesparksubmit is the same as sparksubmit
# gluepyspark is the same as pyspark
# gluepytest is the same as pytest
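
As a rough sketch of what the `glue pytest` step above could pick up, the file below pairs a small pure-Python helper with two tests. It assumes pytest's default discovery (files named `test_*.py`, functions named `test_*`) and that the tests live in the folder mounted at /app. The file name `test_transform.py` and the helper `normalize_record` are hypothetical and are not part of the image or its samples.

```python
# test_transform.py (hypothetical file name; assumed to sit in the folder mounted at /app)


def normalize_record(record):
    """Return a copy of the record with trimmed, lower-cased keys and None values dropped.

    Hypothetical helper: pure Python, so it needs neither Spark nor the Glue libraries.
    """
    return {
        str(key).strip().lower(): value
        for key, value in record.items()
        if value is not None
    }


def test_normalize_record_lowercases_and_trims_keys():
    assert normalize_record({" Name ": "Ada"}) == {"name": "Ada"}


def test_normalize_record_drops_none_values():
    assert normalize_record({"id": 1, "email": None}) == {"id": 1}
```

Keeping the transformation logic in plain functions like this means it can be tested quickly without starting a Spark session; the same function could then be called from a script run through `sparksubmit`.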
License
MIT License. Please see License File for more information.