Open and extensible DataOps management
A core part of our DataOps platform, Databand’s open source library enables you to track data quality information, monitor pipeline health, and automate advanced DataOps processes. We keep our library open source to give users control over how data is tracked and to let them build custom extensions for any requirement.

Data quality monitoring
Run health checks on data lake and database tables in systems such as S3, Snowflake, and Redshift. The checks are built on Databand’s open source library, which makes it easy to report data quality and performance metrics to Databand’s monitoring system or to your local logging system.
Get out-of-the-box metrics for tracking data freshness, accuracy, and completeness.
Customize data trackers according to your most important data quality checks.
Instantly set up data health tracking in Airflow DAGs, Spark jobs, and your data warehouse.
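As a rough illustration of what freshness and completeness checks measure, here is a minimal sketch in plain Python. It does not use Databand’s API: the function name `data_health_metrics`, the row layout, and the field names are all hypothetical, and the results are sent to Python’s standard `logging` module as a stand-in for a local logging system.

```python
import logging
from datetime import datetime, timezone

logger = logging.getLogger("data_health")  # stand-in for a local logging system


def data_health_metrics(rows, timestamp_field, required_fields, now=None):
    """Hypothetical helper: simple freshness and completeness for a batch of rows."""
    now = now or datetime.now(timezone.utc)
    total_cells = len(rows) * len(required_fields)
    if total_cells == 0:
        return {"row_count": len(rows), "freshness_seconds": None, "completeness": None}
    # Freshness: age of the most recently updated record.
    latest = max(row[timestamp_field] for row in rows)
    # Completeness: share of required cells that hold a value.
    filled = sum(
        1 for row in rows for field in required_fields if row.get(field) is not None
    )
    return {
        "row_count": len(rows),
        "freshness_seconds": (now - latest).total_seconds(),
        "completeness": filled / total_cells,
    }


# Illustrative data only; a real check would read from a table or file.
rows = [
    {"id": 1, "price": 2.5, "updated_at": datetime(2020, 6, 1, 0, 0, tzinfo=timezone.utc)},
    {"id": 2, "price": None, "updated_at": datetime(2020, 6, 1, 12, 0, tzinfo=timezone.utc)},
]
metrics = data_health_metrics(
    rows, "updated_at", ["id", "price"], now=datetime(2020, 6, 2, tzinfo=timezone.utc)
)
for name, value in metrics.items():
    logger.info("%s=%s", name, value)
```

In this sample the newest row is 12 hours old and three of the four required cells are filled, so the check reports a freshness of 43200 seconds and a completeness of 0.75.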
Pipeline logging and metrics tracking
Integrate Databand into pipelines to report metrics about your data quality and job performance.
Automatically generate data profiling and statistics on data files and tables.
Define and report any custom metric about your data every time your pipeline runs.
Track workflow inputs and outputs and lineage of data across tasks and broader pipelines.
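The ideas above (profiling, custom metrics on every run, and input/output tracking) can be sketched in plain Python. This is an assumption-laden illustration, not Databand’s API: `profile_numbers`, `tracked`, and `run_log` are hypothetical names, and an in-memory list stands in for a metrics backend.

```python
import statistics
import time
from functools import wraps


def profile_numbers(values):
    """Hypothetical profiler: basic statistics for a numeric column (None = missing)."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": statistics.fmean(present) if present else None,
    }


def tracked(run_log):
    """Hypothetical decorator factory: record one entry per task call in run_log."""
    def decorator(func):
        @wraps(func)
        def wrapper(records):
            start = time.perf_counter()
            result = func(records)
            run_log.append({
                "task": func.__name__,
                "rows_in": len(records),    # size of the task input
                "rows_out": len(result),    # size of the task output
                "seconds": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator


run_log = []  # stand-in for a metrics store


@tracked(run_log)
def drop_missing(prices):
    return [p for p in prices if p is not None]


raw_prices = [2.5, None, 4.0]
clean_prices = drop_missing(raw_prices)
profile = profile_numbers(raw_prices)
```

Each call to a decorated task appends a record with the task name and its input and output sizes, which is the raw material for following data across tasks in a pipeline.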
Automation tools
Create advanced automation for data pipeline management and MLOps, including pipeline testing, model deployment, and retraining.
Abstract out configurations for compute environments and data locations so it's easier to test, deploy, and iterate.
Run different versions of pipelines based on changing data inputs, parameters, or model scores.
Execute pipelines easily across large Spark or Kubernetes clusters.
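One way to picture the configuration and versioning points above is the following sketch. It is a generic Python pattern, not Databand’s configuration system: the `Environment` class, the environment names, the example paths, and the score threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical description of where a pipeline runs and where its data lives."""
    name: str
    data_root: str
    engine: str


# Illustrative environments; the paths and engine names are placeholders.
ENVIRONMENTS = {
    "local": Environment("local", "/tmp/data", "local"),
    "prod": Environment("prod", "s3://example-bucket/data", "spark"),
}


def resolve_path(env_name, dataset):
    """Build a dataset location from the chosen environment, not from hard-coded paths."""
    env = ENVIRONMENTS[env_name]
    return f"{env.data_root}/{dataset}"


def choose_pipeline_version(model_score, threshold=0.8):
    """Pick a pipeline variant from the latest model score (threshold is an assumption)."""
    return "retrain" if model_score < threshold else "score_only"
```

Because pipeline code only asks for `resolve_path(env_name, ...)`, the same code can be tested locally and then deployed against production storage by changing a single name.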
# Run the pipeline as an Airflow DAG
from airflow import DAG
from dbnd import task  # assumed source of @task: the dbnd library

def buy_vegetables(veg_list):
    # veg_store comes from the example's own "store" module
    from store import veg_store
    return veg_store.purchase(veg_list)

@task
def cut(vegetables):
    chopped = []
    for veg in vegetables:
        chopped.append(veg.dice())
    return [x + "\n" for x in chopped]

def add_dressing(chopped_vegetables, dressing, salt_amount="low"):
    for veg in chopped_vegetables:
        veg.season(salt_amount)
    return chopped_vegetables

# data_repo is assumed to be defined elsewhere in the project
def prepare_salad(vegetables_list=data_repo.vegetables, dressing="oil"):
    vegetables = buy_vegetables(vegetables_list)
    chopped = cut(vegetables)
    dressed = add_dressing(chopped, dressing)
    return dressed

with DAG(dag_id="prepare_salad") as dag:
    salad = prepare_salad()

""" CLI:
airflow backfill -s 2020-06-01 -e 2020-06-02 prepare_salad
"""
# Run the same pipeline with the dbnd CLI
from dbnd import task  # assumed source of @task: the dbnd library

def buy_vegetables(veg_list):
    # veg_store comes from the example's own "store" module
    from store import veg_store
    return veg_store.purchase(veg_list)

@task
def cut(vegetables):
    chopped = []
    for veg in vegetables:
        chopped.append(veg.dice())
    return [x + "\n" for x in chopped]

def add_dressing(chopped_vegetables, dressing, salt_amount="low"):
    for veg in chopped_vegetables:
        veg.season(salt_amount)
    return chopped_vegetables

# data_repo is assumed to be defined elsewhere in the project
def prepare_salad(vegetables_list=data_repo.vegetables, dressing="oil"):
    vegetables = buy_vegetables(vegetables_list)
    chopped = cut(vegetables)
    dressed = add_dressing(chopped, dressing)
    return dressed

""" CLI:
dbnd run prepare_salad
"""
Contributions to the community
We’ve benefited greatly from the work of other developers and we want to share the love. These are some recent contributions we’ve made to the community.
Recent contributions
Scheduler Optimizations
Fix data issues fast
See how Databand can transform data observability at your organization today.