How to Monitor Airflow with Prometheus, StatsD, and Grafana

Get best practices, written by data experts, that will help you and your data engineering team to:

  • Monitor your Airflow pipelines better
  • Configure an open-source observability dashboard
  • Discover how to trust your data

[Image: screenshot of a Grafana operational dashboard]

What's inside

Monitoring Airflow can be painful. To debug health problems or find the root cause of failures, a data engineer needs to hop between the Apache Airflow UI, DAG logs, various monitoring tools, and Python code.

It doesn’t have to be this way.

You can use operational dashboards to get a bird’s-eye view of your system, clusters, and overall health.

In this guide, we explore best practices for building an operational dashboard with open-source tools.

The goal is a dashboard that quickly answers questions like:

  • Is our cluster alive?
  • How many DAGs do we have in the DagBag?
  • Which operators succeeded and which failed lately?
  • How many tasks are running right now?
  • How long did it take for the DAG to complete?
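
As a rough illustration of how these questions relate to what Airflow emits, the sketch below pairs each one with a metric name taken from Airflow's StatsD metrics documentation and parses a raw StatsD line. The metric names and the default "airflow" prefix should be checked against your Airflow version, and the helper parse_statsd_line is a hypothetical name introduced here for illustration only.

```python
# A minimal sketch, not a definitive implementation.
# Assumption: metric names follow Airflow's documented StatsD metrics
# and may differ between Airflow versions.
QUESTION_TO_METRIC = {
    "Is our cluster alive?": "scheduler_heartbeat",
    "How many DAGs do we have in the DagBag?": "dagbag_size",
    "Which operators succeeded and which failed lately?":
        "operator_successes_<operator_name>, operator_failures_<operator_name>",
    "How many tasks are running right now?": "executor.running_tasks",
    "How long did it take for the DAG to complete?":
        "dagrun.duration.success.<dag_id>",
}


def parse_statsd_line(line, prefix="airflow"):
    """Split a StatsD line such as 'airflow.dagbag_size:42|g'.

    Returns (metric_name_without_prefix, value, metric_type).
    Hypothetical helper; 'airflow' is assumed to be the configured prefix.
    """
    name_part, _, rest = line.strip().partition(":")
    value_part, _, type_part = rest.partition("|")
    # Drop an optional sample-rate suffix such as '|@0.1'.
    metric_type = type_part.split("|")[0]
    dotted_prefix = prefix + "."
    if name_part.startswith(dotted_prefix):
        name_part = name_part[len(dotted_prefix):]
    return name_part, float(value_part), metric_type


if __name__ == "__main__":
    for sample in (
        "airflow.scheduler_heartbeat:1|c",
        "airflow.dagbag_size:42|g",
        "airflow.dagrun.duration.success.my_dag:1250.5|ms",
    ):
        print(parse_statsd_line(sample))
```

In a Prometheus and Grafana setup, lines like these are typically translated by a StatsD exporter into Prometheus metrics, which Grafana panels then query.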
