helm install of Airflow into a namespace fails with error: File "<string>", line 32, in <module> TimeoutError: There are still unapplied migrations after 60 seconds

Open patsevanton opened this issue 4 years ago • 34 comments

Apache Airflow version: master git

Kubernetes version (if you are using kubernetes) (use kubectl version):


Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.17", 

Environment:

  • Cloud provider or hardware configuration: Microk8s
  • OS (e.g. from /etc/os-release): VERSION="18.04.3 LTS (Bionic Beaver)"

What happened:

git clone https://github.com/apache/airflow.git
cd airflow/chart/
helm dependency update
kubectl create namespace xxxxx
werf helm install --wait --set webserver.defaultUser.password=password,ingress.enabled=true,ingress.hosts[0]=airflow.192.168.22.7.xip.io --namespace xxxxx airflow ./

Log

│ ┌ deploy/airflow-webserver po/airflow-webserver-86857b5969-sqkv6 container/wait-for-airflow-migrations logs
│ │ [2021-04-13 05:57:20,571] {<string>:35} INFO - Waiting for migrations... 60 second(s)
│ │ Traceback (most recent call last):
│ │   File "<string>", line 32, in <module>
│ │ TimeoutError: There are still unapplied migrations after 60 seconds.
│ └ deploy/airflow-webserver po/airflow-webserver-86857b5969-sqkv6 container/wait-for-airflow-migrations logs

Next log lines

│ deploy/airflow-scheduler ERROR: po/airflow-scheduler-658d5d4454-r2sgl container/wait-for-airflow-migrations: CrashLoopBackOff: back-off 10s restarting failed container=wait-for-airflow-migrations             ↵
│ pod=airflow-scheduler-658d5d4454-r2sgl_sdpcc(40e85057-2aa5-4e9e-a47d-e91530038c0c)
│ 1/1 allowed errors occurred for deploy/airflow-scheduler: continue tracking

Full log https://gist.github.com/patsevanton/0edd5571cf69aa539edcdb803c288061

patsevanton avatar Apr 13 '21 06:04 patsevanton

kubectl logs -n xxxxx airflow-webserver-86857b5969-sqkv6
Error from server (BadRequest): container "webserver" in pod "airflow-webserver-86857b5969-sqkv6" is waiting to start: PodInitializing

patsevanton avatar Apr 13 '21 06:04 patsevanton

kubectl logs -n xxxxx airflow-postgresql-0

postgresql 05:56:01.18
postgresql 05:56:01.18 Welcome to the Bitnami postgresql container
postgresql 05:56:01.18 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql
postgresql 05:56:01.18 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql/issues
postgresql 05:56:01.18 Send us your feedback at [email protected]
postgresql 05:56:01.19
postgresql 05:56:01.20 INFO  ==> ** Starting PostgreSQL setup **
postgresql 05:56:01.23 INFO  ==> Validating settings in POSTGRESQL_* env vars..
postgresql 05:56:01.24 INFO  ==> Loading custom pre-init scripts...
postgresql 05:56:01.24 INFO  ==> Initializing PostgreSQL database...
postgresql 05:56:01.25 INFO  ==> postgresql.conf file not detected. Generating it...
postgresql 05:56:01.25 INFO  ==> pg_hba.conf file not detected. Generating it...
postgresql 05:56:02.32 INFO  ==> Starting PostgreSQL in background...
postgresql 05:56:02.44 INFO  ==> Changing password of postgres
postgresql 05:56:02.45 INFO  ==> Configuring replication parameters
postgresql 05:56:02.47 INFO  ==> Configuring fsync
postgresql 05:56:02.47 INFO  ==> Loading custom scripts...
postgresql 05:56:02.48 INFO  ==> Enabling remote connections
postgresql 05:56:02.48 INFO  ==> Stopping PostgreSQL...
postgresql 05:56:03.49 INFO  ==> ** PostgreSQL setup finished! **

postgresql 05:56:03.52 INFO  ==> ** Starting PostgreSQL **
2021-04-13 05:56:03.537 GMT [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2021-04-13 05:56:03.537 GMT [1] LOG:  listening on IPv6 address "::", port 5432
2021-04-13 05:56:03.556 GMT [1] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-04-13 05:56:03.586 GMT [178] LOG:  database system was shut down at 2021-04-13 05:56:02 GMT
2021-04-13 05:56:03.596 GMT [1] LOG:  database system is ready to accept connections
2021-04-13 05:56:10.476 GMT [193] LOG:  incomplete startup packet
2021-04-13 05:56:12.106 GMT [194] LOG:  incomplete startup packet
2021-04-13 05:57:20.415 GMT [284] LOG:  incomplete startup packet
2021-04-13 05:57:22.399 GMT [286] LOG:  incomplete startup packet
2021-04-13 05:58:44.731 GMT [397] LOG:  incomplete startup packet
2021-04-13 05:58:45.741 GMT [398] LOG:  incomplete startup packet
2021-04-13 06:00:17.733 GMT [533] LOG:  incomplete startup packet
2021-04-13 06:00:18.752 GMT [534] LOG:  incomplete startup packet
2021-04-13 06:02:18.723 GMT [703] LOG:  incomplete startup packet
2021-04-13 06:02:21.740 GMT [714] LOG:  incomplete startup packet
2021-04-13 06:04:51.723 GMT [917] LOG:  incomplete startup packet
2021-04-13 06:05:01.784 GMT [933] LOG:  incomplete startup packet
2021-04-13 06:08:53.728 GMT [1248] LOG:  incomplete startup packet
2021-04-13 06:08:56.783 GMT [1256] LOG:  incomplete startup packet
2021-04-13 06:15:15.739 GMT [1773] LOG:  incomplete startup packet
2021-04-13 06:15:16.759 GMT [1780] LOG:  incomplete startup packet

patsevanton avatar Apr 13 '21 06:04 patsevanton

kubectl logs -n xxxxx airflow-scheduler-658d5d4454-r2sgl
error: a container name must be specified for pod airflow-scheduler-658d5d4454-r2sgl, choose one of: [scheduler scheduler-gc] or one of the init containers: [wait-for-airflow-migrations]

patsevanton avatar Apr 13 '21 06:04 patsevanton

kubectl describe -n xxxxx pod airflow-scheduler-658d5d4454-r2sgl

Name:         airflow-scheduler-658d5d4454-r2sgl
Namespace:    xxxxx
Priority:     0
Node:         ubuntu1804/192.168.22.7
Start Time:   Tue, 13 Apr 2021 05:54:59 +0000
Labels:       component=scheduler
              pod-template-hash=658d5d4454
              release=airflow
              tier=airflow
Annotations:  checksum/airflow-config: d84f720b402097e58a879efc896869845ec8bae56455470bf241221b2a016f19
              checksum/extra-configmaps: 2e44e493035e2f6a255d08f8104087ff10d30aef6f63176f1b18f75f73295598
              checksum/extra-secrets: bb91ef06ddc31c0c5a29973832163d8b0b597812a793ef911d33b622bc9d1655
              checksum/metadata-secret: a954626eab69d09b0c9bfd44128c793948c18d943d9e97431903985654b350c5
              checksum/pgbouncer-config-secret: da52bd1edfe820f0ddfacdebb20a4cc6407d296ee45bcb500a6407e2261a5ba2
              checksum/result-backend-secret: af25d110685219c9219e6a4f9b268566118a4b732de33192387a111d1f241c89
              cluster-autoscaler.kubernetes.io/safe-to-evict: true
Status:       Pending
IP:           10.1.78.6
IPs:
  IP:           10.1.78.6
Controlled By:  ReplicaSet/airflow-scheduler-658d5d4454
Init Containers:
  wait-for-airflow-migrations:
    Container ID:  containerd://ac2a25e781647e59aa341e5e308ebbef60408d69b1a2f6b5f2d83df808718ec2
    Image:         apache/airflow:2.0.0
    Image ID:      docker.io/apache/airflow@sha256:e973fef20d3be5b6ea328d2707ac87b90f680382790d1eb027bd7766699b2409
    Port:          <none>
    Host Port:     <none>
    Args:
      python
      -c
      import airflow
      import logging
      import os
      import time

      from alembic.config import Config
      from alembic.runtime.migration import MigrationContext
      from alembic.script import ScriptDirectory

      from airflow import settings

      package_dir = os.path.abspath(os.path.dirname(airflow.__file__))
      directory = os.path.join(package_dir, 'migrations')
      config = Config(os.path.join(package_dir, 'alembic.ini'))
      config.set_main_option('script_location', directory)
      config.set_main_option('sqlalchemy.url', settings.SQL_ALCHEMY_CONN.replace('%', '%%'))
      script_ = ScriptDirectory.from_config(config)

      timeout=60

      with settings.engine.connect() as connection:
          context = MigrationContext.configure(connection)
          ticker = 0
          while True:
              source_heads = set(script_.get_heads())

              db_heads = set(context.get_current_heads())
              if source_heads == db_heads:
                  break

              if ticker >= timeout:
                  raise TimeoutError("There are still unapplied migrations after {} seconds.".format(ticker))
              ticker += 1
              time.sleep(1)
              logging.info('Waiting for migrations... %s second(s)', ticker)

    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 13 Apr 2021 06:15:15 +0000
      Finished:     Tue, 13 Apr 2021 06:16:24 +0000
    Ready:          False
    Restart Count:  7
    Environment:
      AIRFLOW__CORE__FERNET_KEY:        <set to the key 'fernet-key' in secret 'airflow-fernet-key'>        Optional: false
      AIRFLOW__CORE__SQL_ALCHEMY_CONN:  <set to the key 'connection' in secret 'airflow-airflow-metadata'>  Optional: false
      AIRFLOW_CONN_AIRFLOW_DB:          <set to the key 'connection' in secret 'airflow-airflow-metadata'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from airflow-scheduler-token-q6zfr (ro)
Containers:
  scheduler:
    Container ID:
    Image:         apache/airflow:2.0.0
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      bash
      -c
      exec airflow scheduler
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       exec [python -Wignore -c import os
os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'

from airflow.jobs.scheduler_job import SchedulerJob
from airflow.utils.db import create_session
from airflow.utils.net import get_hostname
import sys

with create_session() as session:
    job = session.query(SchedulerJob).filter_by(hostname=get_hostname()).order_by(
        SchedulerJob.latest_heartbeat.desc()).limit(1).first()

sys.exit(0 if job.is_alive() else 1)
] delay=10s timeout=5s period=30s #success=1 #failure=10
    Environment:
      AIRFLOW__CORE__FERNET_KEY:        <set to the key 'fernet-key' in secret 'airflow-fernet-key'>        Optional: false
      AIRFLOW__CORE__SQL_ALCHEMY_CONN:  <set to the key 'connection' in secret 'airflow-airflow-metadata'>  Optional: false
      AIRFLOW_CONN_AIRFLOW_DB:          <set to the key 'connection' in secret 'airflow-airflow-metadata'>  Optional: false
    Mounts:
      /opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
      /opt/airflow/logs from logs (rw)
      /opt/airflow/pod_templates/pod_template_file.yaml from config (ro,path="pod_template_file.yaml")
      /var/run/secrets/kubernetes.io/serviceaccount from airflow-scheduler-token-q6zfr (ro)
  scheduler-gc:
    Container ID:
    Image:         apache/airflow:2.0.0
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      bash
      /clean-logs
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /opt/airflow/logs from logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from airflow-scheduler-token-q6zfr (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      airflow-airflow-config
    Optional:  false
  logs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  airflow-scheduler-token-q6zfr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  airflow-scheduler-token-q6zfr
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  25m                   default-scheduler  Successfully assigned xxxxx/airflow-scheduler-658d5d4454-r2sgl to ubuntu1804
  Normal   Pulling    24m                   kubelet            Pulling image "apache/airflow:2.0.0"
  Normal   Pulled     24m                   kubelet            Successfully pulled image "apache/airflow:2.0.0"
  Normal   Created    17m (x5 over 24m)     kubelet            Created container wait-for-airflow-migrations
  Normal   Started    17m (x5 over 24m)     kubelet            Started container wait-for-airflow-migrations
  Normal   Pulled     17m (x4 over 22m)     kubelet            Container image "apache/airflow:2.0.0" already present on machine
  Warning  BackOff    4m58s (x50 over 21m)  kubelet            Back-off restarting failed container

patsevanton avatar Apr 13 '21 06:04 patsevanton

Same issue with namespace airflow

git clone https://github.com/apache/airflow.git
cd airflow/chart/
helm dependency update
kubectl create namespace airflow
werf helm install --wait --set webserver.defaultUser.password=password,ingress.enabled=true,ingress.hosts[0]=airflow.192.168.22.7.xip.io --namespace airflow airflow ./

patsevanton avatar Apr 13 '21 07:04 patsevanton

@mik-laj @kaxil Please take a look at this issue. Many thanks!

patsevanton avatar Apr 13 '21 08:04 patsevanton

How do I enable debug output in wait-for-airflow-migrations? Thanks!

patsevanton avatar Apr 13 '21 14:04 patsevanton

Check the logs of run-airflow-migrations container in {{ .Release.Name }}-run-airflow-migrations

wait-for-airflow-migrations just waits for the migration to be run.

https://github.com/apache/airflow/blob/6e31465a30dfd17e2e1409a81600b2e83c910036/chart/templates/migrate-database-job.yaml
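
To make the "just waits" point concrete: judging from the init container args in the kubectl describe output earlier in this thread, the script only compares Alembic heads in a loop and never applies a migration itself. Below is a minimal sketch of that loop, with the Alembic calls replaced by two hypothetical callables (get_source_heads, get_db_heads) so it runs standalone. The function name and the injectable sleep parameter are illustrative assumptions, not part of the chart.

```python
import time
from typing import Callable, Iterable


def wait_for_migrations(
    get_source_heads: Callable[[], Iterable[str]],
    get_db_heads: Callable[[], Iterable[str]],
    timeout: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the database heads match the source heads.

    Returns the number of seconds waited. Raises TimeoutError if the heads
    still differ after `timeout` polls. Nothing here applies a migration;
    some other process has to do that.
    """
    ticker = 0
    while True:
        # Compare the revisions shipped with the code to those in the DB.
        if set(get_source_heads()) == set(get_db_heads()):
            return ticker
        if ticker >= timeout:
            raise TimeoutError(
                f"There are still unapplied migrations after {ticker} seconds."
            )
        ticker += 1
        sleep(1)
```

If the migration Job never starts, get_db_heads keeps returning the same stale value and the loop can only end in the TimeoutError shown in the logs above.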

kaxil avatar Apr 13 '21 16:04 kaxil

kubectl logs -n airflow airflow-scheduler-78d9ffb5ff-5lw8f wait-for-airflow-migrations

DB_BACKEND=postgresql
DB_HOST=airflow-postgresql.airflow.svc.cluster.local
DB_PORT=5432

[2021-04-14 04:07:12,861] {migration.py:155} INFO - Context impl PostgresqlImpl.
[2021-04-14 04:07:12,862] {migration.py:162} INFO - Will assume transactional DDL.
[2021-04-14 04:07:18,838] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
[2021-04-14 04:07:20,086] {<string>:35} INFO - Waiting for migrations... 1 second(s)
[2021-04-14 04:07:21,090] {<string>:35} INFO - Waiting for migrations... 2 second(s)
[2021-04-14 04:07:22,093] {<string>:35} INFO - Waiting for migrations... 3 second(s)
[2021-04-14 04:07:23,095] {<string>:35} INFO - Waiting for migrations... 4 second(s)
[2021-04-14 04:07:24,097] {<string>:35} INFO - Waiting for migrations... 5 second(s)
[2021-04-14 04:07:25,100] {<string>:35} INFO - Waiting for migrations... 6 second(s)
[2021-04-14 04:07:26,102] {<string>:35} INFO - Waiting for migrations... 7 second(s)
[2021-04-14 04:07:27,104] {<string>:35} INFO - Waiting for migrations... 8 second(s)
[2021-04-14 04:07:28,107] {<string>:35} INFO - Waiting for migrations... 9 second(s)
[2021-04-14 04:07:29,109] {<string>:35} INFO - Waiting for migrations... 10 second(s)
[2021-04-14 04:07:30,111] {<string>:35} INFO - Waiting for migrations... 11 second(s)
[2021-04-14 04:07:31,114] {<string>:35} INFO - Waiting for migrations... 12 second(s)
[2021-04-14 04:07:32,116] {<string>:35} INFO - Waiting for migrations... 13 second(s)
[2021-04-14 04:07:33,118] {<string>:35} INFO - Waiting for migrations... 14 second(s)
[2021-04-14 04:07:34,121] {<string>:35} INFO - Waiting for migrations... 15 second(s)
[2021-04-14 04:07:35,124] {<string>:35} INFO - Waiting for migrations... 16 second(s)
[2021-04-14 04:07:36,126] {<string>:35} INFO - Waiting for migrations... 17 second(s)
[2021-04-14 04:07:37,129] {<string>:35} INFO - Waiting for migrations... 18 second(s)
[2021-04-14 04:07:38,131] {<string>:35} INFO - Waiting for migrations... 19 second(s)
[2021-04-14 04:07:39,134] {<string>:35} INFO - Waiting for migrations... 20 second(s)
[2021-04-14 04:07:40,136] {<string>:35} INFO - Waiting for migrations... 21 second(s)
[2021-04-14 04:07:41,139] {<string>:35} INFO - Waiting for migrations... 22 second(s)
[2021-04-14 04:07:42,141] {<string>:35} INFO - Waiting for migrations... 23 second(s)
[2021-04-14 04:07:43,143] {<string>:35} INFO - Waiting for migrations... 24 second(s)
[2021-04-14 04:07:44,145] {<string>:35} INFO - Waiting for migrations... 25 second(s)
[2021-04-14 04:07:45,148] {<string>:35} INFO - Waiting for migrations... 26 second(s)
[2021-04-14 04:07:46,150] {<string>:35} INFO - Waiting for migrations... 27 second(s)
[2021-04-14 04:07:47,152] {<string>:35} INFO - Waiting for migrations... 28 second(s)
[2021-04-14 04:07:48,154] {<string>:35} INFO - Waiting for migrations... 29 second(s)
[2021-04-14 04:07:49,157] {<string>:35} INFO - Waiting for migrations... 30 second(s)
[2021-04-14 04:07:50,159] {<string>:35} INFO - Waiting for migrations... 31 second(s)
[2021-04-14 04:07:51,161] {<string>:35} INFO - Waiting for migrations... 32 second(s)
[2021-04-14 04:07:52,162] {<string>:35} INFO - Waiting for migrations... 33 second(s)
[2021-04-14 04:07:53,164] {<string>:35} INFO - Waiting for migrations... 34 second(s)
[2021-04-14 04:07:54,166] {<string>:35} INFO - Waiting for migrations... 35 second(s)
[2021-04-14 04:07:55,168] {<string>:35} INFO - Waiting for migrations... 36 second(s)
[2021-04-14 04:07:56,170] {<string>:35} INFO - Waiting for migrations... 37 second(s)
[2021-04-14 04:07:57,172] {<string>:35} INFO - Waiting for migrations... 38 second(s)
[2021-04-14 04:07:58,175] {<string>:35} INFO - Waiting for migrations... 39 second(s)
[2021-04-14 04:07:59,177] {<string>:35} INFO - Waiting for migrations... 40 second(s)
[2021-04-14 04:08:00,180] {<string>:35} INFO - Waiting for migrations... 41 second(s)
[2021-04-14 04:08:01,182] {<string>:35} INFO - Waiting for migrations... 42 second(s)
[2021-04-14 04:08:02,185] {<string>:35} INFO - Waiting for migrations... 43 second(s)
[2021-04-14 04:08:03,187] {<string>:35} INFO - Waiting for migrations... 44 second(s)
[2021-04-14 04:08:04,189] {<string>:35} INFO - Waiting for migrations... 45 second(s)
[2021-04-14 04:08:05,192] {<string>:35} INFO - Waiting for migrations... 46 second(s)
[2021-04-14 04:08:06,194] {<string>:35} INFO - Waiting for migrations... 47 second(s)
[2021-04-14 04:08:07,196] {<string>:35} INFO - Waiting for migrations... 48 second(s)
[2021-04-14 04:08:08,199] {<string>:35} INFO - Waiting for migrations... 49 second(s)
[2021-04-14 04:08:09,201] {<string>:35} INFO - Waiting for migrations... 50 second(s)
[2021-04-14 04:08:10,203] {<string>:35} INFO - Waiting for migrations... 51 second(s)
[2021-04-14 04:08:11,206] {<string>:35} INFO - Waiting for migrations... 52 second(s)
[2021-04-14 04:08:12,208] {<string>:35} INFO - Waiting for migrations... 53 second(s)
[2021-04-14 04:08:13,211] {<string>:35} INFO - Waiting for migrations... 54 second(s)
[2021-04-14 04:08:14,212] {<string>:35} INFO - Waiting for migrations... 55 second(s)
[2021-04-14 04:08:15,215] {<string>:35} INFO - Waiting for migrations... 56 second(s)
[2021-04-14 04:08:16,217] {<string>:35} INFO - Waiting for migrations... 57 second(s)
[2021-04-14 04:08:17,219] {<string>:35} INFO - Waiting for migrations... 58 second(s)
[2021-04-14 04:08:18,222] {<string>:35} INFO - Waiting for migrations... 59 second(s)
[2021-04-14 04:08:19,224] {<string>:35} INFO - Waiting for migrations... 60 second(s)
Traceback (most recent call last):
  File "<string>", line 32, in <module>
TimeoutError: There are still unapplied migrations after 60 seconds.

patsevanton avatar Apr 14 '21 04:04 patsevanton

I faced this problem a few days ago. However, I tried installing Airflow today using the script shown below and it seems to be working.

#!/bin/bash -x
rm -rf airflow
git clone https://github.com/apache/airflow.git
cd airflow/chart
helm dependency update
helm install airflow . -n airflow
  • results
$ kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
airflow-postgresql-0                1/1     Running   0          88s
airflow-scheduler-db9d85f4d-5nx6j   2/2     Running   0          88s
airflow-statsd-5556dc96bc-c24f5     1/1     Running   0          88s
airflow-webserver-f4f4cb77f-xwgcn   1/1     Running   0          88s
  • Logs of the wait-for-airflow-migrations container
$ kubectl logs -f airflow-scheduler-db9d85f4d-5nx6j wait-for-airflow-migrations
DB_BACKEND=postgresql
DB_HOST=airflow-postgresql.airflow.svc.cluster.local
DB_PORT=5432
........
[2021-04-14 05:16:38,136] {migration.py:155} INFO - Context impl PostgresqlImpl.
[2021-04-14 05:16:38,137] {migration.py:162} INFO - Will assume transactional DDL.
[2021-04-14 05:16:42,390] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
[2021-04-14 05:16:44,341] {<string>:35} INFO - Waiting for migrations... 1 second(s)
[2021-04-14 05:16:45,348] {<string>:35} INFO - Waiting for migrations... 2 second(s)
[2021-04-14 05:16:46,351] {<string>:35} INFO - Waiting for migrations... 3 second(s)
[2021-04-14 05:16:47,355] {<string>:35} INFO - Waiting for migrations... 4 second(s)
[2021-04-14 05:16:48,367] {<string>:35} INFO - Waiting for migrations... 5 second(s)
[2021-04-14 05:16:49,371] {<string>:35} INFO - Waiting for migrations... 6 second(s)
[2021-04-14 05:16:50,375] {<string>:35} INFO - Waiting for migrations... 7 second(s)
[2021-04-14 05:16:51,380] {<string>:35} INFO - Waiting for migrations... 8 second(s)
  • My environment

    • OS: Ubuntu 20.04
    • Kubernetes: v1.20.2
    • on-premise server
      • cpu: intel xeon
  • Apache Airflow version: master git

    • head commit id: https://github.com/apache/airflow/commit/70c74c1f6867a2f6cdd2f892a40f43aea858572b

gen16k avatar Apr 14 '21 05:04 gen16k

@patsevanton I asked for logs from a different container in https://github.com/apache/airflow/issues/15340#issuecomment-818861093 😄 (the names are a bit confusing)

Check the logs of run-airflow-migrations (not wait-for-migration) container in {{ .Release.Name }}-run-airflow-migrations

kaxil avatar Apr 15 '21 23:04 kaxil

I cannot reproduce it right now. I'll come back to this later.

patsevanton avatar Apr 27 '21 09:04 patsevanton

@kaxil

kubectl logs -n xxxxx airflow-scheduler-0  run-airflow-migrations
error: container run-airflow-migrations is not valid for pod airflow-scheduler-0

kubectl logs -n xxxxx run-airflow-migrations
Error from server (NotFound): pods "run-airflow-migrations" not found

patsevanton avatar May 13 '21 05:05 patsevanton

@kaxil

git clone https://github.com/apache/airflow.git
cd airflow/chart/
helm dependency update
kubectl create namespace apatsev
werf helm install --wait --set webserver.defaultUser.password=password,ingress.enabled=true,ingress.hosts[0]=airflow.192.168.22.8.sslip.io --namespace apatsev airflow ./

The pod is not found:

kubectl logs -n apatsev airflow-run-airflow-migrations
Error from server (NotFound): pods "airflow-run-airflow-migrations" not found
kubectl logs -n apatsev airflow-run-airflow-migrations run-airflow-migrations
Error from server (NotFound): pods "airflow-run-airflow-migrations" not found

https://github.com/apache/airflow/blob/6e31465a30dfd17e2e1409a81600b2e83c910036/chart/templates/migrate-database-job.yaml#L27 is of kind Job.

I don't have any Job:

kubectl get all -A | grep Job
kubectl get all -A | grep job
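
Since the migration runs as a Job named {{ .Release.Name }}-run-airflow-migrations, its pod gets a generated suffix, so kubectl logs with the bare Job name as a pod name returns NotFound. A small Python sketch of a more direct check follows. It assumes kubectl is on PATH, and the helper names (migration_job_name, has_migration_job, list_jobs, logs_command) are made up for illustration.

```python
import json
import subprocess
from typing import List


def migration_job_name(release: str) -> str:
    """Job name rendered from `{{ .Release.Name }}-run-airflow-migrations`."""
    return f"{release}-run-airflow-migrations"


def has_migration_job(job_list: dict, release: str) -> bool:
    """Check parsed `kubectl get jobs --output json` data for the migration Job."""
    wanted = migration_job_name(release)
    return any(
        item.get("metadata", {}).get("name") == wanted
        for item in job_list.get("items", [])
    )


def list_jobs(namespace: str) -> dict:
    """Fetch all Jobs in a namespace as parsed JSON (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "jobs", "--namespace", namespace, "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def logs_command(release: str, namespace: str) -> List[str]:
    """Build a `kubectl logs` call that targets the Job instead of a pod name."""
    return [
        "kubectl", "logs",
        "--namespace", namespace,
        f"job/{migration_job_name(release)}",
    ]
```

With the release and namespace from this comment, has_migration_job(list_jobs("apatsev"), "airflow") returning False would confirm that the Job was never created. If it returns True, subprocess.run(logs_command("airflow", "apatsev")) prints the Job's logs.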

patsevanton avatar May 18 '21 07:05 patsevanton

FYI, have you tried setting "wait" to false? I found this works for me: https://forum.astronomer.io/t/run-airflow-migration-and-wait-for-airflow-migrations/1189/10

LiboShen avatar Jun 08 '21 22:06 LiboShen

@LiboShen How do I add wait=false to the install? This is how I install Airflow:

git clone https://github.com/apache/airflow.git
cd airflow/chart/
helm dependency update
kubectl create namespace apatsev
werf helm install --wait --set webserver.defaultUser.password=password,ingress.enabled=true,ingress.hosts[0]=airflow.192.168.22.8.sslip.io --namespace apatsev airflow ./

Do I need to create a file or add an option?

patsevanton avatar Jun 09 '21 03:06 patsevanton

This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

github-actions[bot] avatar Jul 10 '21 00:07 github-actions[bot]

This issue has been closed because it has not received response from the issue author.

github-actions[bot] avatar Jul 30 '21 00:07 github-actions[bot]

I'm facing the same issue. I never get any pod or job containing run-airflow-migrations, and consequently the wait never ends. Is there a solution for this? I'm not running Terraform and neither is patsevanton. He is using werf, I'm using Flux.

autarchprinceps avatar Aug 11 '21 15:08 autarchprinceps

tldr;

@autarchprinceps I'm having the same trouble here. When deploying the chart to our local machines, everything runs fine. But deploying the chart to our cluster using helm does not create the run-airflow-migrations and create-airflow-user jobs.

You have to set the --wait flag with helm!

roestzwiee avatar Aug 12 '21 15:08 roestzwiee

I also ran into this issue. In the interest of saving time for anyone else that stumbles upon this issue, the fix seems to be setting --wait=false on the Helm command, per @LiboShen's advice.

On Rancher you can un-check "Wait" on the final page before deploying. I'm sure OpenShift and other solutions have similar options for exposing the underlying Helm --wait flag.

I can confirm that this worked on Rancher v2.6.1 installing Airflow to a downstream cluster provisioned by RKE running Kubernetes v1.21.5.
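
A possible explanation for why dropping --wait helps, which is my reading rather than something confirmed in this thread: the migration Job in the linked template appears to be a Helm post-install hook, and with --wait Helm holds post-install hooks until the release's pods are ready. The scheduler and webserver pods in turn stay in their wait-for-airflow-migrations init container until the migrations exist, so each side waits on the other. The toy Python model below only encodes those two assumptions. It does not call Helm or Kubernetes, and simulate_install is a made-up name.

```python
def simulate_install(wait: bool, max_rounds: int = 5) -> str:
    """Toy model of the install ordering; this is not Helm's actual code.

    Assumptions: the migration Job is a post-install hook, and with
    wait=True the hook only fires after the pods are ready.
    """
    migrations_applied = False
    hook_started = False
    for _ in range(max_rounds):
        # Init containers release the pods only once migrations are applied.
        pods_ready = migrations_applied
        if pods_ready and hook_started:
            return "installed"
        # Without --wait the hook starts right away; with it, only after readiness.
        if not hook_started and (pods_ready or not wait):
            hook_started = True
            migrations_applied = True
    return "timed out"
```

Under these assumptions simulate_install(wait=True) never gets past the first state, while simulate_install(wait=False) lets the hook run first and the pods follow.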

dsykes16 avatar Oct 20 '21 00:10 dsykes16

I'm using ArgoCD to deploy the Helm chart, tearing out my hair trying every possible variation, but I'm also not seeing the run-airflow-migrations pod, it doesn't run or show up. So my webserver and scheduler wait forever for the migrations that never started.

I'm not sure how to set the --wait=false param using Argo. I tried argocd app set [my-app] --helm-set-string wait=false but it doesn't seem to do anything.

So I'm stuck.

yehoshuadimarsky avatar Nov 01 '21 01:11 yehoshuadimarsky

@yehoshuadimarsky Did you find the way to do this? I am also trying to implement the exact same thing and have been struggling to get the issue fixed.

JyotiSnK avatar Nov 17 '21 13:11 JyotiSnK

Yes! I finally got this to work: put this in your values.yaml override:

# per https://github.com/apache/airflow/pull/16291
# and https://github.com/apache/airflow/pull/16331
createUserJob:
  jobAnnotations:
    "argocd.argoproj.io/hook": Sync
    "argocd.argoproj.io/sync-wave": "0"
    "argocd.argoproj.io/hook-delete-policy": BeforeHookCreation,HookSucceeded
migrateDatabaseJob:
  jobAnnotations:
    "argocd.argoproj.io/hook": Sync
    "argocd.argoproj.io/sync-wave": "0"
    "argocd.argoproj.io/hook-delete-policy": BeforeHookCreation,HookSucceeded

yehoshuadimarsky avatar Nov 17 '21 14:11 yehoshuadimarsky

Thanks! I tried adding these annotations to my values.yaml file, but for some reason it does not work. When I add them, my application fails with a validation error and the values.yaml file does not even load in the ArgoCD UI. The error points to the line where I add the annotation. Maybe I am missing something. Is there a specific version of charts/argo-cd I am supposed to use to get this working?

JyotiSnK avatar Nov 25 '21 11:11 JyotiSnK

I'm seeing the same issue when deploying using helm to k8s.

Using versions:

  • chart: 1.3.0
  • app: 2.2.1

It seems that these hooks were recently made configurable: https://github.com/apache/airflow/blob/main/chart/values.yaml#L632

I just ran a quick test and indeed the job is now scheduled and run, and after it completes the scheduler and webserver pods spin up.

I don't fully understand why/how it was built this way...?

Paul

paul-bormans avatar Nov 25 '21 16:11 paul-bormans

Cool, I didn't know this was added recently, this should solve the problem really nicely

yehoshuadimarsky avatar Nov 26 '21 06:11 yehoshuadimarsky

https://github.com/apache/airflow/blob/main/chart/values.yaml#L632

Make sure you put it in the correct parts of the YAML file. I was referring to the jobAnnotations of each of the migrate Jobs, such as here

https://github.com/apache/airflow/blob/cab6d96a463e227961b8e487dd000199f4864978/chart/values.yaml#L614

and here

https://github.com/apache/airflow/blob/cab6d96a463e227961b8e487dd000199f4864978/chart/values.yaml#L647

yehoshuadimarsky avatar Nov 26 '21 06:11 yehoshuadimarsky

Workaround I've used:

  1. Set airflow.dbMigrations.runAsJob: True in your values.yaml file (https://github.com/airflow-helm/charts/blob/e1d49498426add959350ab8efacead8f96400759/charts/airflow/templates/db-migrations/db-migrations-job.yaml#L1)
  2. Disable the Helm wait option (https://github.com/airflow-helm/charts/blob/e1d49498426add959350ab8efacead8f96400759/charts/airflow/values.yaml#L340)

kreuzert avatar Dec 02 '21 11:12 kreuzert

seeing same issue

bitsofinfo avatar Dec 20 '21 22:12 bitsofinfo