Quick and Easy Apache Airflow Setup Tutorial
Apache Airflow is an open-source platform designed to manage workflows, specifically data pipelines. It was created by Airbnb to handle their increasingly complex workflows and allows users to:
Define workflows using Python code. This makes them easier to maintain, collaborate on, and test.
Schedule workflows to run at specific times (e.g., daily) or based on events (e.g., a new file being added).
Monitor workflows as they run to see their progress and identify any issues.
Hands-on Practice:
Pre-requisite:
Docker Desktop should be installed.
Setup steps:
Create a folder for the Airflow setup under any directory you like.
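The screenshot of the folder is not reproduced here, so the following is only a sketch. It assumes the layout used in the official Airflow Docker Compose guide (`dags`, `logs`, `plugins`, `config`) and a hypothetical setup folder named `airflow-docker`. The helper name `init_airflow_dirs` is made up for illustration.

```shell
# Create the sub-folders that the official docker-compose.yaml mounts
# into the containers (assumed layout; adjust if yours differs).
# Argument: path of the setup folder (hypothetical default: ./airflow-docker).
init_airflow_dirs() {
  target="${1:-./airflow-docker}"
  mkdir -p "$target/dags" "$target/logs" "$target/plugins" "$target/config"

  # Record the host user id so files written to the mounted folders
  # are not owned by root. This mainly matters on Linux.
  echo "AIRFLOW_UID=$(id -u)" > "$target/.env"
}
```

Calling `init_airflow_dirs ~/airflow-docker` would create the folder tree and the `.env` file in one go. The remaining commands in this tutorial are then run from inside that folder.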
Go to https://airflow.apache.org/docs/apache-airflow/2.9.0/docker-compose.yaml and save the docker-compose file in your Airflow setup folder.
Initialize the database
docker-compose up airflow-init
After initialization is complete, you should see a message like this:
The account created has the login airflow and the password airflow
Once the process finishes successfully, the images it pulled will be listed in Docker Desktop.
Running Airflow
docker-compose up
In a second terminal you can check the condition of the containers and make sure that no containers are in an unhealthy condition:
docker ps
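Reading the full `docker ps` table by eye gets tedious with several containers. As a rough sketch, the check can be narrowed to unhealthy containers with the standard `--filter` and `--format` options of `docker ps`. The function names `list_unhealthy` and `report_health` are hypothetical.

```shell
# List only the containers that Docker currently reports as unhealthy.
list_unhealthy() {
  docker ps --filter "health=unhealthy" --format '{{.Names}}: {{.Status}}'
}

# Print a short summary; return non-zero if anything is unhealthy.
report_health() {
  unhealthy="$(list_unhealthy)"
  if [ -z "$unhealthy" ]; then
    echo "No unhealthy containers"
  else
    echo "Unhealthy containers:"
    echo "$unhealthy"
    return 1
  fi
}
```

Containers that are still starting up show `(health: starting)` and are not matched by this filter, so it may be worth waiting a minute or two after `docker-compose up` before running `report_health`.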
Accessing the environment
After starting Airflow, you can interact with it through the web interface in a browser, by default at http://localhost:8080.
Log in with username: airflow and password: airflow
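The web interface can take a little while to come up after the containers start. The sketch below polls the webserver's `/health` endpoint until it answers. It assumes the default port 8080 from the official docker-compose file; the `AIRFLOW_URL` variable and the `wait_for_airflow` function are illustrative names, not part of Airflow.

```shell
# Assumed default: the webserver is published on localhost:8080.
AIRFLOW_URL="${AIRFLOW_URL:-http://localhost:8080}"

# Poll /health until it responds successfully or the attempts run out.
# Arguments: max attempts (default 30), seconds between attempts (default 5).
wait_for_airflow() {
  attempts="${1:-30}"
  delay="${2:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$AIRFLOW_URL/health"; then
      echo "Airflow is reachable at $AIRFLOW_URL"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Airflow did not respond at $AIRFLOW_URL" >&2
  return 1
}
```

Running `wait_for_airflow` with no arguments waits for up to about two and a half minutes, which is usually enough on a first start. Once it succeeds, the login page should load in the browser.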
Troubleshooting tips:
If you run into an error about insufficient memory, open the Resources section of the Docker Desktop settings and increase the memory size.
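To see how much memory Docker currently has before changing the setting, something like the sketch below may help. It reads the `MemTotal` field from `docker info` and compares it with a threshold of roughly 4 GB, which is the minimum the Airflow Docker guide recommends. The exact threshold value and the function names here are assumptions for illustration.

```shell
# Assumed threshold: about 4 GB, in bytes.
MIN_BYTES=4000000000

# Succeeds when the given byte count meets the threshold.
has_enough_memory() {
  [ "$1" -ge "$MIN_BYTES" ]
}

# Ask Docker how much memory it can use (in bytes) and compare.
check_docker_memory() {
  total="$(docker info --format '{{.MemTotal}}')"
  if has_enough_memory "$total"; then
    echo "Docker memory OK: $total bytes"
  else
    echo "Docker memory too low: $total bytes (want at least $MIN_BYTES)"
    return 1
  fi
}
```

If `check_docker_memory` reports a low value, raising the memory limit in Docker Desktop and restarting the containers is the next thing to try.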