sagerx
PIP additional requirements throws Docker error
Problem Statement
Right now, the CMS Part D DAG sits in a "hidden_dags" folder because it caused an error for me and at least one other person when running docker-compose up airflow-init.
I also have another branch with a MEPS DAG (jrlegrand/meps) that I haven't merged because it depends on a pip package as well.
Figuring out how to handle this error would therefore unlock two very useful DAGs.
I'm assuming the answer is to build a custom image, as the error message suggests: https://airflow.apache.org/docs/docker-stack/build.html
airflow-init | !!!!! Installing additional requirements: 'zipfile-deflate64' !!!!!!!!!!!!
airflow-init |
airflow-init | WARNING: This is a development/test feature only. NEVER use it in production!
airflow-init | Instead, build a custom image as described in
airflow-init |
airflow-init | https://airflow.apache.org/docs/docker-stack/build.html
Criteria for Success
Figure out a way to safely and correctly handle PIP dependencies.
Additional Information
This is what I had to do to fix the error (basically crippling the DAG that depended on this PIP package).
This is the Slack convo with Adam G. https://coderx.slack.com/archives/C05S27E52N8/p1703102061867119
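As a stopgap until a custom image exists, a guarded import would be a less drastic workaround than disabling the DAG. The sketch below is only an illustration, not what the repo currently does: `load_zip_module` and `supports_deflate64` are hypothetical helper names, and it assumes the DAG needs `zipfile_deflate64` (the module installed by the `zipfile-deflate64` package) only to read Deflate64-compressed archives.

```python
import importlib
import zipfile


def load_zip_module():
    """Return a zipfile-compatible module, preferring Deflate64 support.

    Hypothetical helper: falls back to the standard library when the
    optional zipfile-deflate64 package is not installed in the image.
    """
    try:
        # The PyPI package zipfile-deflate64 is imported as zipfile_deflate64.
        return importlib.import_module("zipfile_deflate64")
    except ImportError:
        # Standard zipfile still parses, but cannot extract Deflate64 members.
        return zipfile


def supports_deflate64(module):
    """True when the optional package was found instead of the fallback."""
    return module is not zipfile


zip_module = load_zip_module()
if not supports_deflate64(zip_module):
    print("zipfile-deflate64 not installed; Deflate64 archives will fail to extract")
```

With this in place the DAG file would still parse on the stock image, and a task would fail only when it actually reaches a Deflate64 archive. It does not replace installing the package properly in a custom image.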
Full error log
</details>
C:\Dev\sagerx>docker-compose up airflow-init
time="2023-12-20T15:21:08-05:00" level=warning msg="The \"UMLS_API\" variable is not set. Defaulting to a blank string."
time="2023-12-20T15:21:08-05:00" level=warning msg="The \"UMLS_API\" variable is not set. Defaulting to a blank string."
time="2023-12-20T15:21:08-05:00" level=warning msg="The \"UMLS_API\" variable is not set. Defaulting to a blank string."
time="2023-12-20T15:21:08-05:00" level=warning msg="The \"UMLS_API\" variable is not set. Defaulting to a blank string."
[+] Running 2/0
✔ Container postgres Running 0.0s
✔ Container airflow-init Created 0.0s
Attaching to airflow-init, postgres
airflow-init | The container is run as root user. For security, consider using a regular user account.
airflow-init |
airflow-init |
airflow-init | /home/airflow/.local/lib/python3.7/site-packages/airflow/models/base.py:49 MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings. Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
airflow-init | DB: postgresql+psycopg2://airflow:***@postgres:5432/airflow
airflow-init | Performing upgrade with database postgresql+psycopg2://airflow:***@postgres:5432/airflow
airflow-init | [2023-12-20 20:21:10,566] {migration.py:205} INFO - Context impl PostgresqlImpl.
airflow-init | [2023-12-20 20:21:10,567] {migration.py:212} INFO - Will assume transactional DDL.
airflow-init | [2023-12-20 20:21:10,572] {db.py:1571} INFO - Creating tables
airflow-init | INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
airflow-init | INFO [alembic.runtime.migration] Will assume transactional DDL.
airflow-init | Upgrades done
airflow-init | [2023-12-20 20:21:13,342] {providers_manager.py:238} INFO - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package
airflow-init | [2023-12-20 20:21:13,666] {providers_manager.py:238} INFO - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package
airflow-init | airflow already exist in the db
airflow-init |
airflow-init | !!!!! Installing additional requirements: 'zipfile-deflate64' !!!!!!!!!!!!
airflow-init |
airflow-init | WARNING: This is a development/test feature only. NEVER use it in production!
airflow-init | Instead, build a custom image as described in
airflow-init |
airflow-init | https://airflow.apache.org/docs/docker-stack/build.html
airflow-init |
airflow-init | Adding requirements at container startup is fragile and is done every time
airflow-init | the container starts, so it is onlny useful for testing and trying out
airflow-init | of adding dependencies.
airflow-init |
airflow-init |
airflow-init |
airflow-init | You are running pip as root. Please use 'airflow' user to run pip!
airflow-init |
airflow-init | See: https://airflow.apache.org/docs/docker-stack/build.html#adding-a-new-pypi-package
airflow-init |
airflow-init |
airflow-init exited with code 1