Quick and Easy Apache Airflow Setup Tutorial

Vipin

Apache Airflow is an open-source platform designed to manage workflows, specifically data pipelines. It was created by Airbnb to handle their increasingly complex workflows and allows users to:

  • Define workflows using Python code. This makes them easier to maintain, collaborate on, and test.

  • Schedule workflows to run at specific times (e.g., daily) or based on events (e.g., a new file being added).

  • Monitor workflows as they run to see their progress and identify any issues.

Hands-on Practice:

Pre-requisite:
Docker Desktop should be installed.

Setup steps:

  1. Create the folder below under the directory where you want to set up Airflow.

  2. Go to
    https://airflow.apache.org/docs/apache-airflow/2.9.0/docker-compose.yaml
    and save the docker-compose file in your Airflow setup folder.

  3. Initialize the database

    docker-compose up airflow-init

    After initialization is complete, you should see a message like this:

    The account created has the login airflow and the password airflow

    While this step runs, Docker pulls the images that the docker-compose file needs. Once the process has finished successfully, Docker Desktop lists the following images.

  4. Run Airflow

    docker-compose up

    In a second terminal you can check the condition of the containers and make sure that no containers are in an unhealthy condition:

    docker ps
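Steps 1 and 2 can also be scripted. The snippet below is a minimal sketch, not part of the original setup: the function name `prepare_project` and the folder name `airflow-setup` are made up for illustration, and the folder list (`dags`, `logs`, `plugins`, `config`) is an assumption based on the directories the 2.9.0 docker-compose file mounts into the containers.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL from step 2 of this tutorial.
COMPOSE_URL = "https://airflow.apache.org/docs/apache-airflow/2.9.0/docker-compose.yaml"

# Assumption: the folders that the compose file mounts into the containers.
FOLDERS = ("dags", "logs", "plugins", "config")


def prepare_project(root: Path, download: bool = True) -> Path:
    """Create the Airflow folders under `root` and optionally fetch the compose file.

    `prepare_project` is a hypothetical helper, not an Airflow API.
    """
    root.mkdir(parents=True, exist_ok=True)
    for name in FOLDERS:
        (root / name).mkdir(exist_ok=True)

    compose_file = root / "docker-compose.yaml"
    # Skip the download if the file is already there, so reruns are harmless.
    if download and not compose_file.exists():
        urlretrieve(COMPOSE_URL, str(compose_file))
    return compose_file
```

Calling `prepare_project(Path("airflow-setup"))` would create the folders and download the file; after that, the docker-compose commands from steps 3 and 4 are run from inside that folder.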

Accessing the environment:

After starting Airflow, you can interact with it through the web interface in a browser.

Login with username: airflow and password: airflow
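Before logging in, it can help to confirm that the webserver is actually up. The sketch below assumes the compose file publishes the web interface on `http://localhost:8080` and that the webserver exposes a `/health` endpoint returning one status entry per component; the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: the web interface is published on port 8080 of localhost.
HEALTH_URL = "http://localhost:8080/health"


def unhealthy_components(health: dict) -> list[str]:
    """Return the names of components that report a status other than 'healthy'.

    Components that report no status at all (None) are skipped, since they
    may simply not be running as separate services.
    """
    return sorted(
        name
        for name, info in health.items()
        if info.get("status") not in ("healthy", None)
    )


def check_airflow(url: str = HEALTH_URL, timeout: float = 5.0) -> list[str]:
    """Fetch the health report from the webserver and list unhealthy components."""
    with urlopen(url, timeout=timeout) as response:
        return unhealthy_components(json.load(response))
```

An empty list from `check_airflow()` suggests the components are reporting healthy; a connection error usually means the containers are still starting.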

Troubleshooting tips:
If you come across the error below, go to the Resources section of the Docker Desktop settings and increase the memory size.
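To see how much memory Docker currently has, you can ask the Docker CLI directly. This is a rough sketch under two assumptions: that `docker info --format "{{.MemTotal}}"` prints the engine's total memory in bytes, and that 4 GiB is a sensible minimum for this setup. The helper names are hypothetical.

```python
import subprocess

# Assumption: 4 GiB as a minimum for the Airflow containers.
MIN_BYTES = 4 * 1024**3


def has_enough_memory(total_bytes: int, minimum: int = MIN_BYTES) -> bool:
    """Return True if the reported memory meets the minimum."""
    return total_bytes >= minimum


def docker_memory_bytes() -> int:
    """Ask the Docker CLI for the total memory available to the engine, in bytes."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.MemTotal}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Note that the value Docker reports can be slightly lower than the number set in Docker Desktop, so treat a result just under the threshold as a hint rather than a hard failure.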
