docker-airflow
airflow: no such table: task_instance when running SparkSubmitOperator with another user
Can't execute the DAG with a user other than "airflow":
dag file:
operator = SparkSubmitOperator(
    ...
    run_as_user='root',
)
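For context, a minimal sketch of such a DAG, assuming Airflow 1.x (the dag_id, task_id, application path and connection id below are placeholders; only run_as_user='root' is taken from the report):

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator  # Airflow 1.x import path

dag = DAG(
    dag_id='spark_submit_as_root',      # hypothetical dag id
    start_date=datetime(2018, 9, 1),
    schedule_interval=None,
)

operator = SparkSubmitOperator(
    task_id='spark_job',                # hypothetical task id
    application='/path/to/app.py',      # placeholder application path
    conn_id='spark_default',
    run_as_user='root',                 # the argument that triggers the error
    dag=dag,
)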
error:
[2018-09-16 09:30:27,083] {base_task_runner.py:98} INFO - Subtask: sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: task_instance [SQL: 'SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.execution_date ....
Just for confirmation! (From my personal experience.) Have you done airflow initdb as mentioned in the Airflow Quick Start?
Is this the first time your Airflow is connecting to the DB? Then you must do it. Please note that if your DB is in Docker, you have to do it every time the container is created.
Thanks for your help. Yes, this is the first thing done in entrypoint.sh for the webserver container. Indeed, this error disappears when I run the DAG with run_as_user="airflow"...
It's something related to the user "root"... These env vars are not set for user root:
ENV AIRFLOW__CORE__SQL_ALCHEMY_CONN "postgresql+psycopg2://root:root@postgres:5432/airflow"
ENV AIRFLOW__CELERY__CELERY_RESULT_BACKEND "db+postgresql://root:root@postgres:5432/airflow"
ENV AIRFLOW__CORE__EXECUTOR "LocalExecutor"
And I couldn't set them in the DAG script (os.environ[""]...).
The fact is that inside the Dockerfile, the user "airflow" is set, so the container starts entrypoint.sh with this user, running first initdb and then webserver. Then, when I use the "root" user, I guess these env vars are not set, or initdb is not ready for user "root".
Perhaps this repo is not ready for a user other than "airflow".
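One way to check that guess is to print the metadata-DB connection that Airflow actually resolves for the user in question. A minimal diagnostic sketch, assuming Airflow 1.x and its airflow.configuration module:

# Diagnostic sketch only: print the connection string Airflow resolves for
# the current user/environment. If AIRFLOW__CORE__SQL_ALCHEMY_CONN is missing
# from that user's environment, Airflow falls back to the default sqlite URL,
# which would explain the "no such table: task_instance" error.
import os
from airflow.configuration import conf

print("env var :", os.environ.get("AIRFLOW__CORE__SQL_ALCHEMY_CONN"))
print("resolved:", conf.get("core", "sql_alchemy_conn"))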
I'm using airflow with a dask scheduler + workers. I'm experiencing the same issue; the message gets printed in the dask worker logs:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: task_instance
[SQL: SELECT task_instance.try_number AS task_instance_try_number, task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, task_instance.execution_date AS task_instance_execution_date, task_instance.start_date AS task_instance_start_date, task_instance.end_date AS task_instance_end_date, task_instance.duration AS task_instance_duration, task_instance.state AS task_instance_state, task_instance.max_tries AS task_instance_max_tries, task_instance.hostname AS task_instance_hostname, task_instance.unixname AS task_instance_unixname, task_instance.job_id AS task_instance_job_id, task_instance.pool AS task_instance_pool, task_instance.queue AS task_instance_queue, task_instance.priority_weight AS task_instance_priority_weight, task_instance.operator AS task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, task_instance.pid AS task_instance_pid, task_instance.executor_config AS task_instance_executor_config
FROM task_instance
WHERE task_instance.dag_id = ? AND task_instance.task_id = ? AND task_instance.execution_date = ?
LIMIT ? OFFSET ?]
[parameters: ('example_bash_operator', 'runme_0', '2019-10-09 00:00:00.000000', 1, 0)]
(Background on this error at: http://sqlalche.me/e/e3q8)
distributed.worker - WARNING - Compute Failed
I am using postgres and it is configured correctly inside airflow.cfg. This example is the example_bash_operator.py dag, which does not set any user argument. It seems to be connected to missing ENV vars; should I put these as ENV in my dask workers?
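If that is the cause, a rough sketch of one way to do it (not verified in this setup; the connection string below is a placeholder) would be to set the variables in the worker process before anything imports airflow:

# Sketch only: export the Airflow connection settings in the worker process
# *before* airflow is imported, so the workers talk to Postgres instead of
# falling back to the default sqlite metadata DB.
import os

os.environ["AIRFLOW__CORE__SQL_ALCHEMY_CONN"] = (
    "postgresql+psycopg2://airflow:airflow@postgres:5432/airflow"  # placeholder
)
os.environ["AIRFLOW__CORE__EXECUTOR"] = "DaskExecutor"

import airflow  # imported only after the environment is set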
It appears to be related to initdb, as commented before. Mine is related to the configured user, which does not have these ENV vars set. Not sure if it's the same issue...
having the same issue
Same, but it didn't appear until I started modifying a particular DAG. It only happens for one DAG; another executes just fine. I first tried deleting all DB entries related to that DAG, and I even rebuilt the Airflow DB, and the issue remains.
Try airflow initdb.
Tried airflow db init for the new version of Airflow (2.0.2); same issue.