
Broken DAG: [/usr/local/airflow/dags/upload.py] No module named 'botocore' URGENT!!!!

bheesham-devops opened this issue 5 years ago · 5 comments

Hello,

I followed this article to install Airflow on our EC2 instance.

https://www.qubole.com/tech-blog/how-to-install-apache-airflow-to-run-different-executors/

I built the image with the following command and then ran docker-compose as follows:

docker build --rm --build-arg AIRFLOW_DEPS="mssql,aws" --build-arg PYTHON_DEPS="botocore>=1.4.1" -t puckel/docker-airflow .

docker-compose -f docker-compose-CeleryExecutor.yml up -d

But when we create a new DAG, it fails with module import errors:

Broken DAG: [/usr/local/airflow/dags/upload.py] No module named 'botocore'
Broken DAG: [/usr/local/airflow/dags/upload.py] No module named 'boto3'

I have also tried running the Python script via a BashOperator, but when Airflow runs it, it fails with No module named 'boto3'.
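Roughly, that DAG looks like the following sketch (dag_id, task_id, and schedule are illustrative):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id='s3_upload',
    start_date=datetime(2020, 9, 1),
    schedule_interval=None,
)

# Runs the script with whatever Python interpreter the worker resolves
run_script = BashOperator(
    task_id='run_upload_script',
    bash_command='python /usr/local/airflow/dags/upload.py',
    dag=dag,
)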

When I run the same script manually inside the container, it works fine and raises no import errors.

I also tested the imports interactively inside the container, and they succeed without any error:

python
>>> import boto3
>>> import botocore

This has puzzled us for several hours and we are not able to understand what's wrong.
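To narrow it down, a tiny diagnostic DAG dropped into the dags folder can log which interpreter actually parses the file, since module-level code runs in the scheduler/webserver at parse time, which is where the Broken DAG error originates (names here are illustrative):

import logging
import sys
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# Module-level code executes whenever this file is parsed,
# i.e. in the scheduler/webserver process.
logging.info("DAG file parsed by interpreter: %s", sys.executable)
try:
    import boto3
    logging.info("boto3 %s importable from %s", boto3.__version__, boto3.__file__)
except ImportError as exc:
    logging.error("boto3 missing at parse time: %s", exc)

dag = DAG(dag_id='env_diagnostic', start_date=datetime(2020, 9, 1), schedule_interval=None)
noop = DummyOperator(task_id='noop', dag=dag)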

Please find attached a snippet of the docker build output along with the Python script that downloads a file from S3. We have tried both the Airflow S3 hook and the boto3 library directly.

def download_file(bucket, key, destination):
    import boto3
    s3 = boto3.resource('s3')
    # use the bucket argument (the original passed the bucket name unquoted)
    s3.meta.client.download_file(bucket, key, destination)
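The S3 hook variant we tried is along these lines (a sketch; it assumes an aws_default connection is configured in Airflow):

from airflow.hooks.S3_hook import S3Hook

def download_file_with_hook(bucket, key, destination):
    # S3Hook wraps boto3 and reads credentials from the Airflow connection
    hook = S3Hook(aws_conn_id='aws_default')
    # get_key returns a boto3 S3.Object, which can download itself to disk
    obj = hook.get_key(key, bucket_name=bucket)
    obj.download_file(destination)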

Let us know if there is a mistake in the docker build or the docker-compose setup, or if something else is going on. We have been hammering our heads against this for the last 24-48 hours.

Docker-Compose-BuildOutput.txt s3Download-Upload.txt s3-hook.txt Run-Python-Script.txt

bheesham-devops · Oct 01 '20

Any update on this?

bheesham-devops · Oct 05 '20

I had the same problem; for me, it worked to pip-install the dependencies directly in the Dockerfile:

FROM puckel/docker-airflow:1.10.7

USER root

COPY entrypoint.sh /entrypoint.sh
COPY credentials /credentials
COPY requirements.txt /requirements.txt
COPY airflow.cfg /usr/local/airflow/airflow.cfg

RUN apt-get update && apt-get -y install jq python python-pip vim \
    && pip install --upgrade pip && pip install --upgrade awscli && pip install -r /requirements.txt

RUN ["chmod", "+x", "/entrypoint.sh"]

RUN usermod -d /home/airflow airflow \
&& mkdir -p /home/airflow/.aws \
&& cp /credentials /home/airflow/.aws/

RUN ["chown", "-R", "airflow", "/home/airflow"]

USER airflow

ENTRYPOINT ["/entrypoint.sh"]

# Expose webUI and flower respectively
EXPOSE 8080
EXPOSE 5555

Note: this is a Dockerfile for dev purposes only, since the AWS credentials are baked into the image.
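For context, the requirements.txt referenced above just pins whatever the DAGs import; in this case something like (versions are illustrative):

boto3>=1.9
botocore>=1.12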

And changing my .yml to:

    worker:
        build:
            context: .
            dockerfile: Dockerfile
        # image: puckel/docker-airflow:1.10.7
        restart: always
        depends_on:
            - scheduler
        volumes:
            - ./dags:/usr/local/airflow/dags
            # Uncomment to include custom plugins
            # - ./plugins:/usr/local/airflow/plugins
        environment:
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery
            - POSTGRES_USER=airflow
            - POSTGRES_PASSWORD=airflow
            - POSTGRES_DB=airflow
            # - REDIS_PASSWORD=redispass
        command: worker

I still haven't investigated why it failed, even though the build logs reported that the dependencies had been installed successfully, and when I accessed the container they were indeed installed. My guess is that there is a mismatch between the user that installs the packages and the user that tries to execute them.

toderesa97 · Nov 30 '20

I fixed it by adding a requirements.txt and mounting it into the worker in docker-compose-CeleryExecutor.yml, as shown below, since jobs run on the worker:

    worker:
        image: puckel/docker-airflow:1.10.9
        restart: always
        depends_on:
            - scheduler
        volumes:
            - ./dags:/usr/local/airflow/dags
            - ./requirements.txt:/requirements.txt
            # Uncomment to include custom plugins
            # - ./plugins:/usr/local/airflow/plugins
        environment:
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery
            - POSTGRES_USER=airflow
            - POSTGRES_PASSWORD=airflow
            - POSTGRES_DB=airflow
            - REDIS_PASSWORD=redispass
        command: worker
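If I understand the puckel image correctly, this works because its entrypoint pip-installs /requirements.txt at container start when the file is present; the same mount should be added to any service that parses or runs DAGs (webserver, scheduler, worker). You can verify from the host with something like:

docker-compose -f docker-compose-CeleryExecutor.yml exec worker python -c "import boto3; print(boto3.__version__)"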

I don't know if it's the right way, but it works for me without editing the Dockerfile.

muthu1086 · Dec 30 '20

@muthu1086 I think this is the right way, as explained here.

jakubberezowski · Jan 19 '21

Any suggestion on how to fix this?

File "server.py", line 6, in from dags_database import DAGsDatabase ModuleNotFoundError: No module named 'dags_database'

jahidhasanlinix · Dec 12 '21