Parcel for Apache Airflow

Airflow Parcel

This repository allows you to install Apache Airflow as a parcel deployable by Cloudera Manager.

Requirements

  • A supported operating system.
  • MySQL or PostgreSQL database in which to store Airflow metadata.

Currently Supported Versions of Airflow

  • Airflow 1.9.0
  • Airflow 1.10.3

Currently Supported Operating Systems

  • CentOS/RHEL 6 & 7
  • Debian 8
  • Ubuntu 14.04, 16.04, & 18.04

Installing the Parcel

  1. Install the Airflow CSD first. With the CSD installed, you can skip steps 2 and 3.
  2. In Cloudera Manager, go to Hosts -> Parcels -> Configuration.
  3. Add http://archive.clairvoyantsoft.com/airflow/parcels/latest/ to the Remote Parcel Repository URLs if it does not yet exist.
  4. In Cloudera Manager, go to Hosts -> Parcels. Airflow parcels and their respective versions will be available within the Parcels page.
  5. Download, distribute, and activate the required parcels to use them.
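
If you script step 3 instead of using the Cloudera Manager UI, the "if it does not yet exist" check can be sketched as below. This is only an illustration: the helper name add_repo_url is hypothetical, the example existing URL is made up, and the sketch assumes the Remote Parcel Repository URLs setting is handled as a comma-separated list. It only computes the new list; applying it to Cloudera Manager is left out.

```shell
#!/bin/sh
# Repository URL from step 3 above.
PARCEL_REPO_URL="http://archive.clairvoyantsoft.com/airflow/parcels/latest/"

# add_repo_url CURRENT_LIST (hypothetical helper)
# Prints the comma-separated list with PARCEL_REPO_URL appended,
# or unchanged if the URL is already present.
add_repo_url() {
    current="$1"
    case ",$current," in
        *",$PARCEL_REPO_URL,"*)
            # Already listed: leave the list as it is.
            printf '%s\n' "$current"
            ;;
        *)
            if [ -z "$current" ]; then
                printf '%s\n' "$PARCEL_REPO_URL"
            else
                printf '%s,%s\n' "$current" "$PARCEL_REPO_URL"
            fi
            ;;
    esac
}

# Example with a made-up existing list (assumption, not a real repository).
add_repo_url "https://example.com/parcels/"
```

Running the helper a second time on its own output leaves the list unchanged, so the step is safe to repeat.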

Building the Parcel

  1. Install Docker and Python.
  2. Run the script build_airflow_parcel.sh by executing:
     ./build_airflow_parcel.sh --airflow <airflow_version> --python <python_version> --parcel <parcel_version>
  3. Output will be placed in the target/ directory.
  4. Use ./serve_parcel.sh to serve this directory via HTTP, or move the entire directory contents to your own webserver.
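
As a rough sketch of the build step, the snippet below fills in the three documented flags and prints the resulting command before running it. The version values are assumptions chosen for illustration: 1.10.3 is one of the supported Airflow versions listed above, while the Python version and parcel label are placeholders to replace with your own. The RUN_BUILD switch and the build_cmd helper are not part of this repository.

```shell
#!/bin/sh
# Illustrative values only; substitute the versions you actually need.
AIRFLOW_VERSION="1.10.3"   # one of the supported Airflow versions
PYTHON_VERSION="3.6.8"     # assumption: placeholder Python version
PARCEL_VERSION="b1"        # assumption: placeholder parcel build label

# build_cmd (hypothetical helper): assemble the command from the documented flags.
build_cmd() {
    printf './build_airflow_parcel.sh --airflow %s --python %s --parcel %s\n' \
        "$AIRFLOW_VERSION" "$PYTHON_VERSION" "$PARCEL_VERSION"
}

# Print the command for review; execute it only when RUN_BUILD=1 is set.
build_cmd
if [ "${RUN_BUILD:-0}" = "1" ]; then
    sh -c "$(build_cmd)"
fi
```

Run it from the repository root with RUN_BUILD=1 to start the build; without that variable it only prints the command.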

Resources:

  1. https://github.com/teamclairvoyant/apache-airflow-cloudera-csd