cwl-airflow
run_as_user flag in default_args error
Hi, for security reasons I have to use the "run_as_user" flag defined in "default_args" with Apache Airflow v2.3.3 and cwl-airflow. This feature allows an Airflow task to be run as a different Unix user. More details can be found here: airflow impersonation
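For reference, this is roughly how run_as_user is set in my DAGs (a minimal sketch; the DAG id and the "svcuser" account are placeholders, not my actual workflow):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# "svcuser" is a placeholder for an existing Unix account on the worker
# host; per the Airflow impersonation docs, the airflow user must be able
# to sudo to it without a password prompt.
default_args = {
    "owner": "airflow",
    "run_as_user": "svcuser",
}

with DAG(
    dag_id="impersonation_example",
    start_date=datetime(2022, 8, 1),
    schedule_interval=None,
    default_args=default_args,
) as dag:
    # Each task in this DAG is re-launched via `sudo -u svcuser`.
    whoami = BashOperator(task_id="whoami", bash_command="whoami")
```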
However, I face the error "PID of job runner does not match" when I try to run a workflow in a Docker container:
scheduler | [2022-08-17 05:45:54,093] {scheduler_job.py:353} INFO - 1 tasks up for execution:
scheduler | <TaskInstance: 39_test1_1-my-workflow_test.CWLJobDispatcher manual__2022-08-17T05:45:50+00:00 [scheduled]>
scheduler | [2022-08-17 05:45:54,093] {scheduler_job.py:418} INFO - DAG 39_test1_1-my-workflow_test has 0/16 running and queued tasks
scheduler | [2022-08-17 05:45:54,094] {scheduler_job.py:504} INFO - Setting the following tasks to queued state:
scheduler | <TaskInstance: 39_test1_1-my-workflow_test.CWLJobDispatcher manual__2022-08-17T05:45:50+00:00 [scheduled]>
scheduler | [2022-08-17 05:45:54,097] {scheduler_job.py:546} INFO - Sending TaskInstanceKey(dag_id='39_test1_1-my-workflow_test', task_id='CWLJobDispatcher', run_id='manual__2022-08-17T05:45:50+00:00', try_number=1, map_index=-1) to executor with priority 3 and queue default
scheduler | [2022-08-17 05:45:54,097] {base_executor.py:91} INFO - Adding to queue: ['airflow', 'tasks', 'run', '39_test1_1-my-workflow_test', 'CWLJobDispatcher', 'manual__2022-08-17T05:45:50+00:00', '--local', '--subdir', 'DAGS_FOLDER/39_test1_1-my-workflow_test.py']
scheduler | [2022-08-17 05:45:54,100] {local_executor.py:79} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', '39_test1_1-my-workflow_test', 'CWLJobDispatcher', 'manual__2022-08-17T05:45:50+00:00', '--local', '--subdir', 'DAGS_FOLDER/39_test1_1-my-workflow_test.py']
scheduler | [2022-08-17 05:45:54,172] {dagbag.py:508} INFO - Filling up the DagBag from /home/kokleong/projects/root_perseus_app/cwl-airflow-dev-v3/dags/39_test1_1-my-workflow_test.py
scheduler | /usr/local/lib/python3.8/site-packages/airflow/configuration.py:528 DeprecationWarning: The sql_alchemy_conn option in [core] has been moved to the sql_alchemy_conn option in [database] - the old setting has been used, but please update your config.
scheduler | [2022-08-17 05:45:55,059] {task_command.py:371} INFO - Running <TaskInstance: 39_test1_1-my-workflow_test.CWLJobDispatcher manual__2022-08-17T05:45:50+00:00 [queued]> on host c318617f24d4
scheduler | [2022-08-17 05:46:01,689] {local_executor.py:128} ERROR - Failed to execute task PID of job runner does not match.
scheduler | Traceback (most recent call last):
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 124, in _execute_work_in_fork
scheduler | args.func(args)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 51, in command
scheduler | return func(*args, **kwargs)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/utils/cli.py", line 99, in wrapper
scheduler | return f(*args, **kwargs)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 377, in task_run
scheduler | _run_task_by_selected_method(args, dag, ti)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 183, in _run_task_by_selected_method
scheduler | _run_task_by_local_task_job(args, ti)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 241, in _run_task_by_local_task_job
scheduler | run_job.run()
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/jobs/base_job.py", line 244, in run
scheduler | self._execute()
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/jobs/local_task_job.py", line 136, in _execute
scheduler | self.handle_task_exit(return_code)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/jobs/base_job.py", line 225, in heartbeat
scheduler | self.heartbeat_callback(session=session)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/utils/session.py", line 68, in wrapper
scheduler | return func(*args, **kwargs)
scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/jobs/local_task_job.py", line 211, in heartbeat_callback
scheduler | "Recorded pid %s does not match the current pid %s", recorded_pid, current_pid
scheduler | airflow.exceptions.AirflowException: PID of job runner does not match
scheduler | [2022-08-17 05:46:01,907] {scheduler_job.py:599} INFO - Executor reports execution of 39_test1_1-my-workflow_test.CWLJobDispatcher run_id=manual__2022-08-17T05:45:50+00:00 exited with status failed for try_number 1
scheduler | [2022-08-17 05:46:01,912] {scheduler_job.py:642} INFO - TaskInstance Finished: dag_id=39_test1_1-my-workflow_test, task_id=CWLJobDispatcher, run_id=manual__2022-08-17T05:45:50+00:00, map_index=-1, run_start_date=2022-08-17 05:45:55.369221+00:00, run_end_date=2022-08-17 05:46:00.979434+00:00, run_duration=5.610213, state=failed, executor_state=failed, try_number=1, max_tries=0, job_id=57, pool=default_pool, queue=default, priority_weight=3, operator=CWLJobDispatcher, queued_dttm=2022-08-17 05:45:54.095112+00:00, queued_by_job_id=50, pid=618
scheduler | [2022-08-17 05:46:02,949] {dagrun.py:549} ERROR - Marking run <DagRun 39_test1_1-my-workflow_test @ 2022-08-17 05:45:50+00:00: manual__2022-08-17T05:45:50+00:00, externally triggered: True> failed
scheduler | [2022-08-17 05:46:02,949] {dagrun.py:609} INFO - DagRun Finished: dag_id=39_test1_1-my-workflow_test, execution_date=2022-08-17 05:45:50+00:00, run_id=manual__2022-08-17T05:45:50+00:00, run_start_date=2022-08-17 05:45:54.058559+00:00, run_end_date=2022-08-17 05:46:02.949602+00:00, run_duration=8.891043, state=failed, external_trigger=True, run_type=manual, data_interval_start=2022-08-17 05:45:50+00:00, data_interval_end=2022-08-17 05:45:50+00:00, dag_hash=8366f942b3d5bba361f6640b7d2ae180
I do not face this error when I run the same workflow on my local host, though.
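From the traceback, my understanding of the failing check is roughly the following (a stdlib sketch of the idea, not Airflow's actual code): the job runner records the PID of the process it spawned for the task and compares it against the PID the running task instance reports on each heartbeat. When the task command is wrapped in another process, as impersonation does with `sudo -u <user>`, the wrapper and the real task end up with different PIDs:

```python
import subprocess
import sys

# "wrapper" plays the role of the sudo wrapper process; the inner process
# plays the role of the real task, which reports its own PID.
wrapper = subprocess.Popen(
    [sys.executable, "-c",
     "import subprocess, sys;"
     "child = subprocess.Popen([sys.executable, '-c', 'import os; print(os.getpid())']);"
     "child.wait()"],
    stdout=subprocess.PIPE, text=True,
)
task_pid = int(wrapper.stdout.read())  # PID the "task" itself reports
wrapper.wait()

recorded_pid = wrapper.pid             # PID the "job runner" recorded
print(recorded_pid != task_pid)        # prints True: the two PIDs differ
```

This matches the message "Recorded pid %s does not match the current pid %s" in the traceback above.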
Below is some info that I think might help resolve the issue. My list of Python packages:
Package Version
----------------------------------- ------------------
alembic 1.8.1
anyio 3.6.1
apache-airflow 2.3.3
apache-airflow-providers-common-sql 1.0.0
apache-airflow-providers-ftp 3.1.0
apache-airflow-providers-http 4.0.0
apache-airflow-providers-imap 3.0.0
apache-airflow-providers-sqlite 3.2.0
apispec 3.3.2
argcomplete 2.0.0
attrs 20.3.0
Babel 2.10.3
bagit 1.8.1
blinker 1.5
CacheControl 0.12.11
cachelib 0.9.0
cattrs 1.10.0
certifi 2022.6.15
cffi 1.15.1
charset-normalizer 2.1.0
click 8.1.3
clickclick 20.10.2
colorama 0.4.5
coloredlogs 15.0.1
colorlog 4.8.0
commonmark 0.9.1
connexion 2.14.0
cron-descriptor 1.2.31
croniter 1.3.5
cryptography 37.0.4
cwl-airflow 1.2.11
cwltest 2.1.20210626101542
cwltool 3.1.20210816212154
defusedxml 0.7.1
Deprecated 1.2.13
dill 0.3.5.1
dnspython 2.2.1
docker 5.0.3
docutils 0.19
email-validator 1.2.1
Flask 2.2.2
Flask-AppBuilder 4.1.2
Flask-Babel 2.0.0
Flask-Caching 2.0.1
Flask-JWT-Extended 4.4.3
Flask-Login 0.6.2
Flask-Session 0.4.0
Flask-SQLAlchemy 2.5.1
Flask-WTF 0.15.1
graphviz 0.20.1
greenlet 1.1.2
gunicorn 20.1.0
h11 0.12.0
httpcore 0.15.0
httpx 0.23.0
humanfriendly 10.0
idna 3.3
importlib-metadata 4.12.0
importlib-resources 5.9.0
inflection 0.5.1
isodate 0.6.1
itsdangerous 2.1.2
Jinja2 3.1.2
jsonmerge 1.8.0
jsonschema 4.9.1
junit-xml 1.9
lazy-object-proxy 1.7.1
linkify-it-py 2.0.0
lockfile 0.12.2
lxml 4.9.1
Mako 1.2.1
Markdown 3.4.1
markdown-it-py 2.1.0
MarkupSafe 2.1.1
marshmallow 3.17.0
marshmallow-enum 1.5.1
marshmallow-oneofschema 3.0.1
marshmallow-sqlalchemy 0.26.1
mdit-py-plugins 0.3.0
mdurl 0.1.2
mistune 0.8.4
msgpack 1.0.4
mypy-extensions 0.4.3
networkx 2.8.5
packaging 21.3
pathspec 0.9.0
pendulum 2.1.2
pip 22.2.2
pkgutil_resolve_name 1.3.10
pluggy 1.0.0
prison 0.2.1
prov 1.5.1
psutil 5.9.1
psycopg2 2.9.3
pycparser 2.21
pydot 1.4.2
Pygments 2.12.0
PyJWT 2.4.0
pyparsing 3.0.9
pyrsistent 0.18.1
python-daemon 2.3.1
python-dateutil 2.8.2
python-nvd3 0.15.0
python-slugify 6.1.2
pytz 2022.2.1
pytzdata 2020.1
PyYAML 6.0
rdflib 6.0.2
requests 2.28.1
requests-toolbelt 0.9.1
rfc3986 1.5.0
rich 12.5.1
ruamel.yaml 0.17.10
ruamel.yaml.clib 0.2.6
schema-salad 8.3.20220801194920
setproctitle 1.3.2
setuptools 56.0.0
shellescape 3.8.1
six 1.16.0
sniffio 1.2.0
SQLAlchemy 1.4.40
SQLAlchemy-JSONField 1.0.0
SQLAlchemy-Utils 0.38.3
swagger-ui-bundle 0.0.9
tabulate 0.8.10
tenacity 8.0.1
termcolor 1.1.0
text-unidecode 1.3
tornado 6.2
typing_extensions 4.3.0
uc-micro-py 1.0.1
unicodecsv 0.14.1
urllib3 1.26.11
websocket-client 1.3.3
Werkzeug 2.2.2
wrapt 1.14.1
WTForms 2.3.3
zipp 3.8.1
For the Dockerfile and docker-compose file, I used your templates with slight modifications. The important build arguments:
ARG UBUNTU_VERSION="18.04"
ARG PYTHON_VERSION="3.8.12"
ARG CWL_AIRFLOW_VERSION="1.2.11"
I suspect this issue is caused by running the workflow inside a Docker container. So far I have not seen anyone mention this issue on the Airflow GitHub.
Hi @kokleong9406,
I think it can be related to running CWL-Airflow inside Docker. For `docker run` there is a `--user` parameter; I believe something similar can be provided in the docker-compose file.
Let me know if this information was useful.
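Something along these lines in the compose file might work (a sketch only; the service name and the UID:GID values are placeholders for your setup):

```yaml
services:
  scheduler:          # placeholder: whichever service runs the tasks
    user: "1000:1000" # host UID:GID, same effect as `docker run --user`
```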