
airflow no such table: task_instance when running SparkSubmitOperator as another user

Open · jorgeakanieves opened this issue 6 years ago · 8 comments

I can't execute the DAG as a user other than "airflow":

dag file:

operator = SparkSubmitOperator(
    ...
    ...
    run_as_user='root',
)
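A minimal runnable sketch of such a DAG (the dag_id, application path, and conn_id here are placeholders, not values from the actual setup):

import datetime

from airflow import DAG
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

# Minimal reconstruction of the failing DAG; application path and
# conn_id are placeholders for the real job and Spark connection.
with DAG(
    dag_id='spark_submit_as_root',
    start_date=datetime.datetime(2018, 9, 1),
    schedule_interval=None,
) as dag:
    operator = SparkSubmitOperator(
        task_id='submit_job',
        application='/path/to/app.py',  # placeholder
        conn_id='spark_default',
        run_as_user='root',  # fails; run_as_user='airflow' works
    )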

error:

[2018-09-16 09:30:27,083] {base_task_runner.py:98} INFO - Subtask: sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: task_instance [SQL: 'SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.execution_date ....

jorgeakanieves · Sep 16 '18

Just for confirmation!

(from my personal experience) have you done airflow initdb as mentioned in the Airflow Quick Start?

Is this the first time your Airflow is connecting to the DB? Then you must do it. Note that if your DB lives in a Docker container, you have to redo it every time the container is recreated.
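As a quick sanity check (a rough sketch; it reads whatever connection string the current process is configured with), this should print True once initdb has created the tables:

import sqlalchemy
from airflow.configuration import conf

# Rough check that the metadata DB this process points at has been
# initialized; True means the task_instance table exists.
engine = sqlalchemy.create_engine(conf.get('core', 'sql_alchemy_conn'))
print('task_instance' in sqlalchemy.inspect(engine).get_table_names())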

msampathkumar · Sep 20 '18

> Just for confirmation!
>
> (from my personal experience) have you done airflow initdb as mentioned in the Airflow Quick Start?
>
> Is this the first time your Airflow is connecting to the DB? Then you must do it. Note that if your DB lives in a Docker container, you have to redo it every time the container is recreated.

Thanks for your help. Yes, this is the first thing done in entrypoint.sh for the webserver container. Indeed, this error disappears when I run the DAG with run_as_user='airflow'...

It's something related to user "root"... These env vars are not set for user root:

ENV AIRFLOW__CORE__SQL_ALCHEMY_CONN "postgresql+psycopg2://root:root@postgres:5432/airflow"
ENV AIRFLOW__CELERY__CELERY_RESULT_BACKEND "db+postgresql://root:root@postgres:5432/airflow"
ENV AIRFLOW__CORE__EXECUTOR "LocalExecutor"

And I couldn't set them in the DAG script (os.environ[""]...).

The fact is that inside the Dockerfile, the user is set to "airflow". So the container starts entrypoint.sh with this user, running first initdb and then webserver. Then, when I use the "root" user, I guess these env vars are not set for it, or initdb was never run for user "root".
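A way to test that guess is to log the effective configuration from inside the task (a diagnostic sketch only, not a fix; if the env vars are dropped on the switch to root, the second line would show the sqlite default instead of postgres):

import os

from airflow.configuration import conf

# Diagnostic only: log which SQL backend this process ends up with.
# If AIRFLOW__CORE__SQL_ALCHEMY_CONN is missing from the environment,
# conf falls back to the sqlite default, matching the error above.
print(os.environ.get('AIRFLOW__CORE__SQL_ALCHEMY_CONN'))
print(conf.get('core', 'sql_alchemy_conn'))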

Perhaps this repo is not ready to use a user other than "airflow".

jorgeakanieves · Sep 21 '18

I'm using airflow with a dask scheduler + workers. I'm experiencing the same issue; the message gets printed in the dask worker logs:

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: task_instance

[SQL: SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.execution_date AS task_instance_execution_date, task_instance.start_date AS task_instance_start_date, task_instance.end_date AS task_instance_end_date, task_instance.duration AS task_instance_duration, task_instance.state AS task_instance_state, task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS task_instance_hostname, task_instance.unixname AS task_instance_unixname, task_instance.job_id AS task_instance_job_id, task_instance.pool AS task_instance_pool, task_instance.queue AS task_instance_queue, task_instance.priority_weight AS task_instance_priority_weight, task_instance.operator AS task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, task_instance.pid AS task_instance_pid, task_instance.executor_config AS task_instance_executor_config

FROM task_instance

WHERE task_instance.dag_id = ? AND task_instance.task_id = ? AND task_instance.execution_date = ?

LIMIT ? OFFSET ?]

[parameters: ('example_bash_operator', 'runme_0', '2019-10-09 00:00:00.000000', 1, 0)]

(Background on this error at: http://sqlalche.me/e/e3q8)

distributed.worker - WARNING - Compute Failed

I am using postgres and it is configured correctly inside airflow.cfg. This is the example_bash_operator.py dag, which does not set any user argument. It seems to be connected to missing env vars; should I set these as ENV in my dask workers?
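For what it's worth, a quick way to check what the workers actually see (the scheduler address below is a placeholder for the real cluster endpoint):

import os

from dask.distributed import Client

# Ask every worker which connection string it sees; returns a dict
# keyed by worker address. Scheduler address is a placeholder.
client = Client('tcp://dask-scheduler:8786')
print(client.run(lambda: os.environ.get('AIRFLOW__CORE__SQL_ALCHEMY_CONN')))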

chris-aeviator · Oct 10 '19

It appears to be related to initdb, as commented before. Mine is related to the user being run as one that does not have these env vars set. Not sure if it's the same issue...

jorgeakanieves · Oct 10 '19

Having the same issue.

hzitoun · Nov 06 '19

Same, but it didn't appear until I started modifying a particular DAG. It only happens for one DAG; another executes just fine. I first tried deleting all DB entries related to that DAG, and I even rebuilt the airflow DB, but the issue remains.

ctivanovich · Dec 02 '19

Try airflow initdb.

sreev · Oct 09 '20

Tried airflow db init for the new Airflow version 2.0.2; same issue.

illia-sh · Apr 28 '21