docker-airflow
Broken DAG: [/usr/local/airflow/dags/upload.py] No module named 'botocore' URGENT!!!!
Hello,
I followed this article to install Airflow on our EC2 instance.
https://www.qubole.com/tech-blog/how-to-install-apache-airflow-to-run-different-executors/
I built the image with the following command and then ran docker-compose as follows:
docker build --rm --build-arg AIRFLOW_DEPS="mssql,aws" --build-arg PYTHON_DEPS="botocore>=1.4.1" -t puckel/docker-airflow .
docker-compose -f docker-compose-CeleryExecutor.yml up -d
But when we create a new DAG, it either has problems importing modules or shows the following error:
Broken DAG: [/usr/local/airflow/dags/upload.py] No module named 'botocore'
Broken DAG: [/usr/local/airflow/dags/upload.py] No module named 'boto3'
I have also tried running the Python script with a BashOperator, but when the script is run from Airflow it gives the error "No module named 'boto3'".
When I run the same script manually inside the container, it works fine and does not give any module import error.
I also tested the following inside the container and it does not give any import errors:
python
import boto3
import botocore
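One way to narrow this down is to run a small check from a task that Airflow itself executes (for example via a PythonOperator) and compare its output with the manual test in the container. This is a hypothetical diagnostic, not part of the original report:

import sys

def report_environment():
    # Which interpreter is Airflow using, and can it see boto3?
    print("interpreter:", sys.executable)
    try:
        import boto3
        print("boto3 found at:", boto3.__file__)
    except ImportError as exc:
        print("boto3 not importable:", exc)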
This has been cracking our heads for several hours and we are not able to figure out what's wrong.
Please find attached part of the docker build output along with the Python script that downloads a file from S3. We have also tried Airflow's S3 hook operator as well as the boto3 library.
def download_file(bucket, key, destination):
    import boto3
    s3 = boto3.resource('s3')
    # the bucket name 'moso-dba-scripts' was hard-coded here; use the bucket argument instead
    s3.meta.client.download_file(bucket, key, destination)
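For comparison, the same download via Airflow's S3 hook would look roughly like the sketch below, assuming Airflow 1.10.x and an AWS connection configured as aws_default (the function name is illustrative):

from airflow.hooks.S3_hook import S3Hook

def download_file_with_hook(bucket, key, destination):
    # get_conn() returns a boto3 S3 client built from the aws_default connection
    # (or from instance-profile credentials if no connection is configured)
    s3_client = S3Hook(aws_conn_id='aws_default').get_conn()
    s3_client.download_file(bucket, key, destination)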
Let us know if there is a mistake in the Docker build or the docker-compose setup, or if something else is wrong. We have been hammering our heads over this for the last 24-48 hours.
Docker-Compose-BuildOutput.txt s3Download-Upload.txt s3-hook.txt Run-Python-Script.txt
Any update on this?
I had the same problem; for me, it worked by pip installing the dependencies directly in the Dockerfile:
FROM puckel/docker-airflow:1.10.7
USER root
COPY entrypoint.sh /entrypoint.sh
COPY credentials /credentials
COPY requirements.txt /requirements.txt
COPY airflow.cfg /usr/local/airflow/airflow.cfg
RUN apt-get update && apt-get -y install jq python python-pip vim \
&& pip install --upgrade pip && pip install --upgrade awscli && pip install -r /requirements.txt
RUN ["chmod", "+x", "/entrypoint.sh"]
RUN usermod -d /home/airflow airflow \
&& mkdir -p /home/airflow/.aws \
&& cp /credentials /home/airflow/.aws/
RUN ["chown", "-R", "airflow", "/home/airflow"]
USER airflow
ENTRYPOINT ["/entrypoint.sh"]
# Expose web UI and Flower respectively
EXPOSE 8080
EXPOSE 5555
Note: this is a Dockerfile for dev purposes only, since the credentials are baked into the image.
And changing my .yml to:
worker:
    build:
        context: .
        dockerfile: Dockerfile
    # image: puckel/docker-airflow:1.10.7
    restart: always
    depends_on:
        - scheduler
    volumes:
        - ./dags:/usr/local/airflow/dags
        # Uncomment to include custom plugins
        # - ./plugins:/usr/local/airflow/plugins
    environment:
        - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
        - EXECUTOR=Celery
        - POSTGRES_USER=airflow
        - POSTGRES_PASSWORD=airflow
        - POSTGRES_DB=airflow
        # - REDIS_PASSWORD=redispass
    command: worker
I still haven't investigated why it failed; the logs reported that the dependencies had been installed successfully, and when I accessed the machine they were indeed installed. My guess is that it comes down to the user that installs the packages versus the user that tries to run them.
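A quick way to test that guess is to run something like the following inside the worker container, once as root and once as the airflow user, and compare where packages would be installed for each (illustrative only):

import getpass
import site
import sys

# Show the current user, interpreter, and the install locations pip would use.
print("user:", getpass.getuser())
print("interpreter:", sys.executable)
print("global site-packages:", site.getsitepackages())
print("user site-packages:", site.getusersitepackages())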
I fixed it by adding a requirements.txt and attaching it to the workers in docker-compose-CeleryExecutor.yml, as below, since the jobs run in the worker:
worker:
    image: puckel/docker-airflow:1.10.9
    restart: always
    depends_on:
        - scheduler
    volumes:
        - ./dags:/usr/local/airflow/dags
        - ./requirements.txt:/requirements.txt
        # Uncomment to include custom plugins
        # - ./plugins:/usr/local/airflow/plugins
    environment:
        - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
        - EXECUTOR=Celery
        - POSTGRES_USER=airflow
        - POSTGRES_PASSWORD=airflow
        - POSTGRES_DB=airflow
        - REDIS_PASSWORD=redispass
    command: worker
Don't know if it's the right way, but it works for me without editing the Dockerfile.
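For reference, the mounted requirements.txt only needs to list the packages the DAGs import, in this case something like:

boto3
botocore

This approach relies on the image's entrypoint.sh installing a mounted /requirements.txt (with pip install --user) when the container starts, so the packages end up installed for the same airflow user that runs the tasks.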
@muthu1086 I think this is the right way, as explained here.
Any suggestions on how to fix this?
File "server.py", line 6, in