Create ExecutionMode.KUBERNETES example DAG & setup CI

Open tatiana opened this issue 1 year ago • 6 comments

To avoid our documentation becoming outdated or incompatible with the latest version of Cosmos, as described in #534, we should:

  • [ ] Create an example DAG, add it to the folder: https://github.com/astronomer/astronomer-cosmos/tree/main/dev/dags

  • [ ] Reference the new example DAG in the docs, using rst code block: https://github.com/astronomer/astronomer-cosmos/blob/main/docs/getting_started/execution-modes.rst?plain=1#L58-L61 https://github.com/astronomer/astronomer-cosmos/blob/main/docs/getting_started/kubernetes.rst

  • [ ] Set up credentials to a K8s cluster that the CI can use to spin up the necessary resources

  • [ ] Update information so the CI runs this example DAG as part of the expensive integration tests: https://github.com/astronomer/astronomer-cosmos/blob/bcf7714bec09caf3a3ffec9c6c2e72bf43e1c7cf/pyproject.toml#L174 https://github.com/astronomer/astronomer-cosmos/blob/bcf7714bec09caf3a3ffec9c6c2e72bf43e1c7cf/pyproject.toml#L183

tatiana avatar Sep 14 '23 10:09 tatiana

@tatiana thank you! Outdated documentation is blocking a POC using Cosmos. Looking forward to the updated documentation.

qimumu9406 avatar Sep 15 '23 03:09 qimumu9406

One option is to move the Kubernetes example DAG into the astronomer-cosmos repo and automate the steps we describe in our docs with GitHub Actions, using: https://github.com/marketplace/actions/kubernetes-kind-cluster

tatiana avatar Sep 27 '23 14:09 tatiana

Hi, @tatiana,

I'm helping the Cosmos team manage their backlog and am marking this issue as stale. The issue involved creating an example DAG for ExecutionMode.KUBERNETES, adding it to a specified folder, referencing it in the documentation, setting up credentials for a K8s cluster for CI, and updating information for the CI to run the example DAG as part of integration tests. It seems that the issue has been resolved by bringing the Kubernetes example DAG to be part of the astronomer-cosmos repo and automating the steps described in the documentation using GitHub Actions.

Could you please confirm if this issue is still relevant to the latest version of the Cosmos repository? If it is, please let the Cosmos team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation. If you have any further questions or need assistance, feel free to reach out.

dosubot[bot] avatar Mar 09 '24 16:03 dosubot[bot]

This ticket is still relevant, since approximately 30% of Cosmos users use this mode.

tatiana avatar May 13 '24 11:05 tatiana

It seems GitHub Actions would allow us to spin up a KinD cluster, so we could automate running the DAG that we mention in our example (which is currently in a separate repo): https://github.com/marketplace/actions/kubernetes-kind-cluster

tatiana avatar May 17 '24 08:05 tatiana
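
If the DAG is run against a KinD cluster in CI, the job probably needs to wait until the supporting pods (for example, the database) are ready, and to fail fast when one of them is crash-looping. A minimal sketch of such a check follows. It assumes `kubectl` is on the PATH and already points at the KinD cluster, and that every pod in the namespace is long-running. The function names, the timeout values, and the list of fatal statuses are illustrative assumptions, not part of Cosmos or of the KinD action.

```python
import subprocess
import time

# Statuses treated as fatal. This list is an assumption and is not exhaustive.
FAILED_STATUSES = {"CrashLoopBackOff", "Error", "ImagePullBackOff", "ErrImagePull"}


def parse_pods(table: str) -> list[dict[str, str]]:
    """Parse the default table printed by `kubectl get pods` into dicts."""
    lines = [line for line in table.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Only NAME, READY and STATUS are relied on below; later columns may
    # contain spaces in some kubectl versions, so the split count is capped.
    return [dict(zip(header, line.split(None, len(header) - 1))) for line in lines[1:]]


def is_ready(pod: dict[str, str]) -> bool:
    """A pod counts as ready when it is Running and READY shows n/n."""
    ready, total = pod["READY"].split("/")
    return ready == total and pod["STATUS"] == "Running"


def failed_pods(pods: list[dict[str, str]]) -> list[str]:
    """Return the names of pods whose STATUS is considered fatal."""
    return [pod["NAME"] for pod in pods if pod["STATUS"] in FAILED_STATUSES]


def wait_for_pods(namespace: str = "default", timeout: float = 120, interval: float = 5) -> None:
    """Poll `kubectl get pods` until all pods are ready, or raise."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["kubectl", "get", "pods", "--namespace", namespace],
            check=True,
            capture_output=True,
            text=True,
        )
        pods = parse_pods(result.stdout)
        failed = failed_pods(pods)
        if failed:
            raise RuntimeError(f"pods in a failed state: {', '.join(failed)}")
        if pods and all(is_ready(pod) for pod in pods):
            return
        time.sleep(interval)
    raise TimeoutError(f"pods in namespace {namespace!r} not ready after {timeout}s")
```

A CI step could call `wait_for_pods()` after installing the database chart and before triggering the DAG, so that an unhealthy dependency surfaces as a clear error instead of a failing dbt task.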

I drafted a PR (https://github.com/astronomer/astronomer-cosmos/issues/535) for this. The automation script works fine locally, but the Postgres instance is not healthy when running in the GitHub Action. CI Job: https://github.com/astronomer/astronomer-cosmos/actions/runs/10151541725/job/28071022045?pr=1127

Debug log:

+ helm list
helm
NAME    	NAMESPACE	REVISION	UPDATED                                	STATUS  	CHART             	APP VERSION
postgres	default  	1       	2024-07-29 20:43:06.773876127 +0000 UTC	deployed	postgresql-15.5.20	16.3.0     
+ echo pod service
pod service
+ kubectl get pods --namespace default
NAME                    READY   STATUS             RESTARTS   AGE
postgres-postgresql-0   0/1     CrashLoopBackOff   3          67s
+ kubectl get svc --namespace default
NAME                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
kubernetes               ClusterIP   10.96.0.1     <none>        443/TCP    101s
postgres-postgresql      ClusterIP   10.96.88.53   <none>        5432/TCP   67s
postgres-postgresql-hl   ClusterIP   None          <none>        5432/TCP   67s
+ echo pg log
+ kubectl logs postgres-postgresql-0 -c postgresql
pg log
postgresql 20:43:56.14 INFO  ==> 
postgresql 20:43:56.15 INFO  ==> Welcome to the Bitnami postgresql container
postgresql 20:43:56.15 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
postgresql 20:43:56.15 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
postgresql 20:43:56.23 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
postgresql 20:43:56.24 INFO  ==> 
postgresql 20:43:56.34 INFO  ==> ** Starting PostgreSQL setup **
postgresql 20:43:56.44 INFO  ==> Validating settings in POSTGRESQL_* env vars..
postgresql 20:43:56.45 INFO  ==> Loading custom pre-init scripts...
postgresql 20:43:56.54 INFO  ==> Initializing PostgreSQL database...
postgresql 20:43:56.64 INFO  ==> pg_hba.conf file not detected. Generating it...
postgresql 20:43:56.64 INFO  ==> Generating local authentication configuration
+ kubectl describe pod postgres-postgresql-0
Name:         postgres-postgresql-0
Namespace:    default
Priority:     0
Node:         kind-control-plane/172.18.0.3
Start Time:   Mon, 29 Jul 2024 20:43:12 +0000
Labels:       app.kubernetes.io/component=primary
              app.kubernetes.io/instance=postgres
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=postgresql
              app.kubernetes.io/version=16.3.0
              controller-revision-hash=postgres-postgresql-55b4df756d
              helm.sh/chart=postgresql-15.5.20
              statefulset.kubernetes.io/pod-name=postgres-postgresql-0
Annotations:  container.seccomp.security.alpha.kubernetes.io/postgresql: runtime/default
Status:       Running
IP:           10.244.0.6
IPs:
  IP:           10.244.0.6
Controlled By:  StatefulSet/postgres-postgresql
Containers:
  postgresql:
    Container ID:   containerd://a49f8658b13518211e3afc542a4e4c75f735c6b4322d29f1506cb6879286052c
    Image:          docker.io/bitnami/postgresql:16.3.0-debian-12-r23
    Image ID:       docker.io/bitnami/postgresql@sha256:865e341baf49006e32b3e72254a15a81c939178cb9c48fcd9faf1c0ac4b49664
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 29 Jul 2024 20:43:56 +0000
      Finished:     Mon, 29 Jul 2024 20:43:56 +0000
    Ready:          False
    Restart Count:  3
    Limits:
      cpu:                150m
      ephemeral-storage:  2Gi
      memory:             192Mi
    Requests:
      cpu:                100m
      ephemeral-storage:  50Mi
      memory:             128Mi
    Liveness:             exec [/bin/sh -c exec pg_isready -U "postgres" -h 127.0.0.1 -p 5432] delay=30s timeout=5s period=10s #success=1 #failure=6
    Readiness:            exec [/bin/sh -c -e exec pg_isready -U "postgres" -h 127.0.0.1 -p 5432
[ -f /opt/bitnami/postgresql/tmp/.initialized ] || [ -f /bitnami/postgresql/.initialized ]
] delay=5s timeout=5s period=10s #success=1 #failure=6
    Environment:
      BITNAMI_DEBUG:                        false
      POSTGRESQL_PORT_NUMBER:               5432
      POSTGRESQL_VOLUME_DIR:                /bitnami/postgresql
      PGDATA:                               /bitnami/postgresql/data
      POSTGRES_PASSWORD:                    <set to the key 'postgres-password' in secret 'postgres-postgresql'>  Optional: false
      POSTGRESQL_ENABLE_LDAP:               no
      POSTGRESQL_ENABLE_TLS:                no
      POSTGRESQL_LOG_HOSTNAME:              false
      POSTGRESQL_LOG_CONNECTIONS:           false
      POSTGRESQL_LOG_DISCONNECTIONS:        false
      POSTGRESQL_PGAUDIT_LOG_CATALOG:       off
      POSTGRESQL_CLIENT_MIN_MESSAGES:       error
      POSTGRESQL_SHARED_PRELOAD_LIBRARIES:  pgaudit
    Mounts:
      /bitnami/postgresql from data (rw)
      /dev/shm from dshm (rw)
      /opt/bitnami/postgresql/conf from empty-dir (rw,path="app-conf-dir")
      /opt/bitnami/postgresql/tmp from empty-dir (rw,path="app-tmp-dir")
      /tmp from empty-dir (rw,path="tmp-dir")
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-postgres-postgresql-0
    ReadOnly:   false
  empty-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  dshm:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      Memory
    SizeLimit:   <unset>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  62s                default-scheduler  Successfully assigned default/postgres-postgresql-0 to kind-control-plane
  Normal   Pulling    62s                kubelet            Pulling image "docker.io/bitnami/postgresql:16.3.0-debian-12-r23"
  Normal   Pulled     57s                kubelet            Successfully pulled image "docker.io/bitnami/postgresql:16.3.0-debian-12-r23" in 5.400199843s
  Normal   Pulled     21s (x3 over 53s)  kubelet            Container image "docker.io/bitnami/postgresql:16.3.0-debian-12-r23" already present on machine
  Normal   Created    19s (x4 over 54s)  kubelet            Created container postgresql
  Normal   Started    18s (x4 over 54s)  kubelet            Started container postgresql
  Warning  BackOff    12s (x9 over 52s)  kubelet            Back-off restarting failed container

pankajastro avatar Jul 29 '24 20:07 pankajastro
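
One way to read the `kubectl describe pod` output above is to pull out the "Last State" block and compute how long the container ran before exiting. The sketch below does that with the standard library only. The helper names are hypothetical, and the parser assumes the single-container layout shown in the log.

```python
from email.utils import parsedate_to_datetime

# Fields expected under "Last State" in `kubectl describe pod` output.
TERMINATION_KEYS = {"Reason", "Exit Code", "Started", "Finished"}


def last_termination(describe_text: str) -> dict[str, str]:
    """Extract the 'Last State' details of the first container found."""
    fields: dict[str, str] = {}
    in_block = False
    for line in describe_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Last State":
            in_block = True
            fields["State"] = value
        elif in_block and key in TERMINATION_KEYS:
            fields[key] = value
        elif in_block:
            # First unrelated key (for example "Ready") ends the block.
            break
    return fields


def run_seconds(fields: dict[str, str]) -> float:
    """Seconds between the Started and Finished timestamps."""
    started = parsedate_to_datetime(fields["Started"])
    finished = parsedate_to_datetime(fields["Finished"])
    return (finished - started).total_seconds()
```

Applied to the output above, the container starts and finishes within the same second and exits with code 1. That is well before the liveness probe's 30s initial delay, which suggests the failure happens during initialization and is not a probe kill. The output of the crashed attempt can usually be retrieved with `kubectl logs postgres-postgresql-0 -c postgresql --previous`.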