run_as_user flag in default_args error

Open kokleong9406 opened this issue 2 years ago • 1 comments

Hi, for security reasons, I have to use Apache Airflow v2.3.3 with cwl-airflow because I would like to use the "run_as_user" flag defined in "default_args". It is a feature that allows an Airflow task to be run as another Unix user. For more details, see: airflow impersonation
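As a minimal sketch of the setting in question, "run_as_user" is simply one more key in the "default_args" dictionary. The account name "cwl_runner" and the "owner" value below are hypothetical and not taken from this issue.

```python
# Hypothetical default_args for illustration only.
# "cwl_runner" is an assumed Unix account name, not one from this issue.
default_args = {
    "owner": "airflow",
    # Airflow impersonates this Unix user when it runs the task.
    "run_as_user": "cwl_runner",
}

# The dictionary would then be handed to the DAG constructor, for example
# DAG(dag_id="example", default_args=default_args, ...).
print(default_args["run_as_user"])
```

Note that Airflow's impersonation runs the task through sudo, so the account that runs Airflow needs sudo rights for the target user inside whatever environment (host or container) the task executes in.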

I get the error "PID of job runner does not match" when I try to run a workflow in a Docker container:

scheduler    | [2022-08-17 05:45:54,093] {} INFO - 1 tasks up for execution:
scheduler    | 	<TaskInstance: 39_test1_1-my-workflow_test.CWLJobDispatcher manual__2022-08-17T05:45:50+00:00 [scheduled]>
scheduler    | [2022-08-17 05:45:54,093] {} INFO - DAG 39_test1_1-my-workflow_test has 0/16 running and queued tasks
scheduler    | [2022-08-17 05:45:54,094] {} INFO - Setting the following tasks to queued state:
scheduler    | 	<TaskInstance: 39_test1_1-my-workflow_test.CWLJobDispatcher manual__2022-08-17T05:45:50+00:00 [scheduled]>
scheduler    | [2022-08-17 05:45:54,097] {} INFO - Sending TaskInstanceKey(dag_id='39_test1_1-my-workflow_test', task_id='CWLJobDispatcher', run_id='manual__2022-08-17T05:45:50+00:00', try_number=1, map_index=-1) to executor with priority 3 and queue default
scheduler    | [2022-08-17 05:45:54,097] {} INFO - Adding to queue: ['airflow', 'tasks', 'run', '39_test1_1-my-workflow_test', 'CWLJobDispatcher', 'manual__2022-08-17T05:45:50+00:00', '--local', '--subdir', 'DAGS_FOLDER/']
scheduler    | [2022-08-17 05:45:54,100] {} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', '39_test1_1-my-workflow_test', 'CWLJobDispatcher', 'manual__2022-08-17T05:45:50+00:00', '--local', '--subdir', 'DAGS_FOLDER/']
scheduler    | [2022-08-17 05:45:54,172] {} INFO - Filling up the DagBag from /home/kokleong/projects/root_perseus_app/cwl-airflow-dev-v3/dags/
scheduler    | /usr/local/lib/python3.8/site-packages/airflow/ DeprecationWarning: The sql_alchemy_conn option in [core] has been moved to the sql_alchemy_conn option in [database] - the old setting has been used, but please update your config.
scheduler    | [2022-08-17 05:45:55,059] {} INFO - Running <TaskInstance: 39_test1_1-my-workflow_test.CWLJobDispatcher manual__2022-08-17T05:45:50+00:00 [queued]> on host c318617f24d4
scheduler    | [2022-08-17 05:46:01,689] {} ERROR - Failed to execute task PID of job runner does not match.
scheduler    | Traceback (most recent call last):
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/executors/", line 124, in _execute_work_in_fork
scheduler    |     args.func(args)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/cli/", line 51, in command
scheduler    |     return func(*args, **kwargs)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/utils/", line 99, in wrapper
scheduler    |     return f(*args, **kwargs)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/", line 377, in task_run
scheduler    |     _run_task_by_selected_method(args, dag, ti)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/", line 183, in _run_task_by_selected_method
scheduler    |     _run_task_by_local_task_job(args, ti)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/", line 241, in _run_task_by_local_task_job
scheduler    |
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/jobs/", line 244, in run
scheduler    |     self._execute()
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/jobs/", line 136, in _execute
scheduler    |     self.handle_task_exit(return_code)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/jobs/", line 225, in heartbeat
scheduler    |     self.heartbeat_callback(session=session)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/utils/", line 68, in wrapper
scheduler    |     return func(*args, **kwargs)
scheduler    |   File "/usr/local/lib/python3.8/site-packages/airflow/jobs/", line 211, in heartbeat_callback
scheduler    |     "Recorded pid %s does not match the current pid %s", recorded_pid, current_pid
scheduler    | airflow.exceptions.AirflowException: PID of job runner does not match
scheduler    | [2022-08-17 05:46:01,907] {} INFO - Executor reports execution of 39_test1_1-my-workflow_test.CWLJobDispatcher run_id=manual__2022-08-17T05:45:50+00:00 exited with status failed for try_number 1
scheduler    | [2022-08-17 05:46:01,912] {} INFO - TaskInstance Finished: dag_id=39_test1_1-my-workflow_test, task_id=CWLJobDispatcher, run_id=manual__2022-08-17T05:45:50+00:00, map_index=-1, run_start_date=2022-08-17 05:45:55.369221+00:00, run_end_date=2022-08-17 05:46:00.979434+00:00, run_duration=5.610213, state=failed, executor_state=failed, try_number=1, max_tries=0, job_id=57, pool=default_pool, queue=default, priority_weight=3, operator=CWLJobDispatcher, queued_dttm=2022-08-17 05:45:54.095112+00:00, queued_by_job_id=50, pid=618
scheduler    | [2022-08-17 05:46:02,949] {} ERROR - Marking run <DagRun 39_test1_1-my-workflow_test @ 2022-08-17 05:45:50+00:00: manual__2022-08-17T05:45:50+00:00, externally triggered: True> failed
scheduler    | [2022-08-17 05:46:02,949] {} INFO - DagRun Finished: dag_id=39_test1_1-my-workflow_test, execution_date=2022-08-17 05:45:50+00:00, run_id=manual__2022-08-17T05:45:50+00:00, run_start_date=2022-08-17 05:45:54.058559+00:00, run_end_date=2022-08-17 05:46:02.949602+00:00, run_duration=8.891043, state=failed, external_trigger=True, run_type=manual, data_interval_start=2022-08-17 05:45:50+00:00, data_interval_end=2022-08-17 05:45:50+00:00, dag_hash=8366f942b3d5bba361f6640b7d2ae180
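For orientation, the traceback ends in a heartbeat callback that compares a recorded PID with the current PID and raises when they differ. The following is a simplified illustration of that kind of check, not Airflow's actual code. The function and exception names are hypothetical, and treating a missing recorded PID as "nothing to compare" is an assumption.

```python
from typing import Optional


class JobRunnerPidMismatch(Exception):
    """Stand-in for airflow.exceptions.AirflowException in this sketch."""


def check_job_runner_pid(recorded_pid: Optional[int], current_pid: int) -> None:
    """Raise if the PID stored for the task differs from the running one."""
    # Assumption: a missing recorded PID means there is nothing to compare.
    if recorded_pid is not None and recorded_pid != current_pid:
        raise JobRunnerPidMismatch("PID of job runner does not match")


# Matching PIDs pass silently; differing PIDs raise.
check_job_runner_pid(618, 618)
```

Any extra process level between the job runner and the task, such as user delegation or a container entrypoint wrapper, is a plausible way for the two PIDs to diverge.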

I do not get this error when I run the same workflow on my local host, though.

Below is some info that I think might be helpful for resolving the issue. My list of Python packages:

Package                             Version
----------------------------------- ------------------
alembic                             1.8.1
anyio                               3.6.1
apache-airflow                      2.3.3
apache-airflow-providers-common-sql 1.0.0
apache-airflow-providers-ftp        3.1.0
apache-airflow-providers-http       4.0.0
apache-airflow-providers-imap       3.0.0
apache-airflow-providers-sqlite     3.2.0
apispec                             3.3.2
argcomplete                         2.0.0
attrs                               20.3.0
Babel                               2.10.3
bagit                               1.8.1
blinker                             1.5
CacheControl                        0.12.11
cachelib                            0.9.0
cattrs                              1.10.0
certifi                             2022.6.15
cffi                                1.15.1
charset-normalizer                  2.1.0
click                               8.1.3
clickclick                          20.10.2
colorama                            0.4.5
coloredlogs                         15.0.1
colorlog                            4.8.0
commonmark                          0.9.1
connexion                           2.14.0
cron-descriptor                     1.2.31
croniter                            1.3.5
cryptography                        37.0.4
cwl-airflow                         1.2.11
cwltest                             2.1.20210626101542
cwltool                             3.1.20210816212154
defusedxml                          0.7.1
Deprecated                          1.2.13
dnspython                           2.2.1
docker                              5.0.3
docutils                            0.19
email-validator                     1.2.1
Flask                               2.2.2
Flask-AppBuilder                    4.1.2
Flask-Babel                         2.0.0
Flask-Caching                       2.0.1
Flask-JWT-Extended                  4.4.3
Flask-Login                         0.6.2
Flask-Session                       0.4.0
Flask-SQLAlchemy                    2.5.1
Flask-WTF                           0.15.1
graphviz                            0.20.1
greenlet                            1.1.2
gunicorn                            20.1.0
h11                                 0.12.0
httpcore                            0.15.0
httpx                               0.23.0
humanfriendly                       10.0
idna                                3.3
importlib-metadata                  4.12.0
importlib-resources                 5.9.0
inflection                          0.5.1
isodate                             0.6.1
itsdangerous                        2.1.2
Jinja2                              3.1.2
jsonmerge                           1.8.0
jsonschema                          4.9.1
junit-xml                           1.9
lazy-object-proxy                   1.7.1
linkify-it-py                       2.0.0
lockfile                            0.12.2
lxml                                4.9.1
Mako                                1.2.1
Markdown                            3.4.1
markdown-it-py                      2.1.0
MarkupSafe                          2.1.1
marshmallow                         3.17.0
marshmallow-enum                    1.5.1
marshmallow-oneofschema             3.0.1
marshmallow-sqlalchemy              0.26.1
mdit-py-plugins                     0.3.0
mdurl                               0.1.2
mistune                             0.8.4
msgpack                             1.0.4
mypy-extensions                     0.4.3
networkx                            2.8.5
packaging                           21.3
pathspec                            0.9.0
pendulum                            2.1.2
pip                                 22.2.2
pkgutil_resolve_name                1.3.10
pluggy                              1.0.0
prison                              0.2.1
prov                                1.5.1
psutil                              5.9.1
psycopg2                            2.9.3
pycparser                           2.21
pydot                               1.4.2
Pygments                            2.12.0
PyJWT                               2.4.0
pyparsing                           3.0.9
pyrsistent                          0.18.1
python-daemon                       2.3.1
python-dateutil                     2.8.2
python-nvd3                         0.15.0
python-slugify                      6.1.2
pytz                                2022.2.1
pytzdata                            2020.1
PyYAML                              6.0
rdflib                              6.0.2
requests                            2.28.1
requests-toolbelt                   0.9.1
rfc3986                             1.5.0
rich                                12.5.1
ruamel.yaml                         0.17.10
ruamel.yaml.clib                    0.2.6
schema-salad                        8.3.20220801194920
setproctitle                        1.3.2
setuptools                          56.0.0
shellescape                         3.8.1
six                                 1.16.0
sniffio                             1.2.0
SQLAlchemy                          1.4.40
SQLAlchemy-JSONField                1.0.0
SQLAlchemy-Utils                    0.38.3
swagger-ui-bundle                   0.0.9
tabulate                            0.8.10
tenacity                            8.0.1
termcolor                           1.1.0
text-unidecode                      1.3
tornado                             6.2
typing_extensions                   4.3.0
uc-micro-py                         1.0.1
unicodecsv                          0.14.1
urllib3                             1.26.11
websocket-client                    1.3.3
Werkzeug                            2.2.2
wrapt                               1.14.1
WTForms                             2.3.3
zipp                                3.8.1

For the Dockerfile and docker-compose file, I used your templates with slight modifications. Below is some of the important info:


I suspect that this issue is caused by running the workflow in a Docker container. So far I have not seen anyone mention this issue on the Airflow GitHub.

kokleong9406 • Aug 17 '22 06:08

Hi @kokleong9406,

I think it may be related to running CWL-Airflow inside Docker. docker run has a --user parameter, and I believe something similar can be set in the docker-compose file.
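A minimal sketch of the --user suggestion, written as a small Python helper that assembles the docker run arguments. The helper name, the image name, and the 1000:1000 IDs are hypothetical. Only the --user and --rm flags are real Docker options.

```python
from typing import List


def docker_run_command(image: str, uid: int, gid: int) -> List[str]:
    """Build a `docker run` argument list that starts the container as uid:gid."""
    # --user sets the user (and optionally group) the container process runs as.
    return ["docker", "run", "--rm", "--user", f"{uid}:{gid}", image]


# Hypothetical image name and IDs, for illustration only.
print(" ".join(docker_run_command("my-cwl-airflow-image", 1000, 1000)))
```

In a Compose file, the service-level "user" key accepts the same uid:gid value. Whether this resolves the PID mismatch is untested here.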

Let me know if this information was useful.

michael-kotliar • Sep 12 '22 20:09