A curated list of awesome dbt resources
Awesome dbt

Welcome to the awesome curated list of dbt resources!
Contributions of any kind are encouraged and appreciated. Before contributing, please check the contribution guidelines. Add new entries at the top of each section (LIFO) to keep fresh items more visible. Also, feel free to add new sections.
Happy contributing!
Contents
- Get Started
- How To
- Integrations
- User Stories
- Data Quality
- CI/CD
- Orchestration
- Utilities
- Packages
- Community
- Sample Projects
- Contributors
Get Started
Courses to get you started with Analytics Engineering.
- dbt in a real world scenario, A Beginner dbt tutorial - A beginner tutorial to understand dbt with a real-world example.
- Analytics Engineering Glossary - Living collection of terms & concepts commonly used in the data industry by dbt Labs.
- Zero to Hero dbt - Complete course covering both theory & practice through real-world Airbnb use-case.
- Data Engineering Zoomcamp - Data engineering course on cutting edge tools including dbt.
- Analytics Engineering with dbt - Paid course offered by co:rise covering the basics of dbt.
- dbt Fundamentals - Official free course offered by dbt Labs. Excellent for learning the basics of dbt Cloud.
- Refactoring SQL for Modularity - Another free course offered by dbt Labs, on dbt refactoring and CTE supercharging.
- Learn DBT from Scratch - Guides you through a setup paired with Snowflake (decorated with extras).
How To
A helping hand with setting up integrations and implementing best practices.
- Dry running our data warehouse using BigQuery and dbt - Use dbt & BigQuery dry run jobs to validate our 1000+ models in under 30 seconds.
- Automatically generate ERD - Automatically generate ERDs and display in your docs site.
- Business Intelligence Standards - Best practices in Business Intelligence standards for integrating with dbt.
- Jinja cheatsheet - Jinja cheatsheet for dbt development.
- Test SQL Pipelines against Production Clones using DBT and Snowflake - Leverage Snowflake Zero-copy-clones to run slim ci checks.
- How we structure our dbt projects - How the dbt team structures its dbt projects.
- dbt guide - Primer on how you should properly set up and configure your dbt workflow.
- dbt for Data Transformation – Hands-on - Yet another tutorial for using dbt Cloud.
- Start Modeling Data - Configuring BigQuery with your dbt project.
- Accelerating Data Teams with dbt & Snowflake - A dbt & Snowflake workshop on financial data.
- Creating a dev environment quickly on Snowflake - Setting up the integration with Snowflake.
- How to set up a dbt data-ops workflow, using dbt cloud and Snowflake - Leverage GitHub Actions to set up CI/CD with dbt Core.
- How to configure your dbt repository - Mono-repo or not mono-repo?
- Best Practices for Optimizing Your dbt and Snowflake Deployment - Pocket guide on optimization best practices with Snowflake.
- How to Deploy dbt to Production using GitHub Actions
- Doing More With Less: Using DBT to load data from AWS S3 to Snowflake via External Tables - An alternative guide to set up your dbt-external-tables workflow.
- Best Practices for your dbt Style Guide - Standards for a well-organized base layer with Airbyte ingestion.
- Tips and Tricks about working with dbt - Tips from community members.
- Writing Unit Tests for dbt - An overview of the dbt-unit-testing package.
Integrations
Collection of known data integrations with dbt.
- Cube - APIs, Caching, and Access Control on top of dbt Metrics.
- FlexIt Analytics - Business Intelligence platform with deep dbt Cloud and CLI integration.
- Raycast - Raycast integration to monitor dbt Cloud Jobs.
- Metaplane - Data Observability layer on top of your dbt + BI project.
- Dbt + Machine Learning: What makes a great baton pass? - Landscape of ML utilities around dbt.
- Soda - Integration of Soda's data observability platform and dbt.
- Supported Adapters - Officially supported database adapters.
- Lightdash - Open source Looker alternative with deep dbt integration.
- Superset - Open source visualization layer for your Modern Data Stack.
- Dagster and dbt: Better Together - Getting started with the dagster-dbt library.
- fal - Add multi-language support (Python) to your dbt project.
- Privacy Dynamics - Anonymize data in your dbt project.
User Stories
Use-cases and user stories implemented by the community members using components of the MDS with dbt.
- How HomeToGo connected dbt and Superset to make metadata more accessible and reduce analytical overhead - A dbt<>Superset connector that leverages Superset's API capabilities and dbt's manifest.
- Self-service Business Intelligence - Eliminate the need for a data modeling semantic layer in BI.
- Leveraging DBT as a Data Modeling tool - Reflection on one-year usage of dbt.
- dbt + Materialize: Streaming to a dbt project near you - How to own your real-time transformation workflows like batch-based alternatives.
- Who's really using dbt? - Behind the community of analytics engineers.
- dbt and the Analytics Engineer — what's the hype about - Behind the upheaval of the analytics engineer profession.
- Analyzing Fishtown's dbt project performance with artifacts - Using project artifacts to identify anomalies and room for refactoring.
- Deploying and Running dbt on Azure Container Instances - Demonstration of integration with Azure.
- Beware of DBT Incremental Updates Against Snowflake External Tables - Things you should be aware of when using external tables with dbt.
- dbt development at Vimeo - Best practices from the Vimeo Data team.
Data Quality
Best practices and extensions of the testing framework.
- PipeRider - PipeRider allows you to define the shape of your data once, and then use the data checking functionality to alert you to changes in your data quality.
- Elementary - A dbt package that provides data anomaly detection as dbt tests.
- Environment-dependent Unit Testing in dbt - Guide on how to run unit tests in dbt dynamically.
- dbt-expectations - Port between dbt and great_expectations to extend out-of-the-box tests.
- re_data - A dbt package for monitoring metrics and detecting anomalies.
- How do you test your data - Suggestions on testing your data powered by the community.
- How to unit test sql transforms in dbt - Unit test using source defer and generic custom tests.
CI/CD
Get the most out of product quality and seamless delivery.
- Slim CI/CD with Bitbucket Pipelines - How to set up slim CI on Bitbucket.
- dbt-docs-to-notion - A GitHub action for exporting dbt model docs to a Notion database.
- Anatomy of A Pipeline: CI/CD For a dbt Data Warehouse on Google Big Query Using Azure Pipelines - Setting up CI/CD for a BigQuery stack using Azure Pipelines.
- Continuous Integration and Automated Build Testing with dbtCloud - Detailed blog post on setting up Slim CI in dbt Cloud.
- How to review an analytics pull request - Checkpoints to consider when reviewing an analytics engineer PR.
- Performing a blue/green deploy of your dbt project on Snowflake - A very tidy and fail-safe way to run dbt in production by using two parallel production environments.
- How we speed up our CI runs by 10x using Slim CI - Limit data in long-running CI checks to improve the development experience.
Orchestration
Resources to manage and maintain dependencies in modern data pipelines.
- Building a Scalable Analytics Architecture with Airflow and dbt - Leveraging the dbt manifest in Airflow.
- Auto-generating an Airflow DAG using the dbt manifest - Yet another article on extracting value from the manifest file.
- Building a robust data pipeline with the dAG stack: dbt, Airflow, Great Expectations - Demonstration of a data orchestration project with Airflow.
- Run dbt in Azure Data Factory - Primer about dbt on Azure Data Stack.
Utilities
Useful tools and extensions to bump up your analytics engineer workflow.
- pytest-dbt-core - A pytest plugin for testing your dbt projects.
- looker-gen - Generate LookML from dbt.
- dbtenv - A version manager for dbt.
- sqlfmt - This tool formats your dbt SQL code so you don't have to.
- SQLFluff - SQL linter that supports dbt and Jinja templating.
- Build Data Access Layer on dbt - Package to build GraphQL API on top of your dbt project.
- Run changed models based on Git status - Handy bash function to run changed models since last commit.
- How we set up our computers for working on dbt projects - Things I wish I had known when I started working with dbt. Tools and hacks to improve the development experience.
- fzf-dbt - Search dbt models interactively from terminal.
- vscode-dbt-power-user - VSCode extension to give more clarity on model dependencies.
- Your Essential dbt Project Checklist - Checklist on items necessary for a successful dbt project.
- dbt Style Guide - Development style guide often referenced in PR templates.
- Clean your warehouse of old and deprecated models - Clean out warehouse models that no longer exist in the project.
- dbt-tips - Excellent companion to your dbt practice with a rich collection of tips.
- dbt-tags - Understanding the scopes of dbt tags.
- Pre-commit hooks - Pre-commit hooks for checking data integrity before a schema change commit.
Packages
Community-developed packages to extend default macros and toolset.
- dbt-yaml-check - Checks that columns defined in YAML also exist in SQL.
- data-diff - A command-line tool and Python library to efficiently diff rows across two different databases.
- dbt-project-evaluator - This package highlights areas of a dbt project that are misaligned with dbt Labs' best practices.
- dbt_constraints - Generate database constraints based on the tests in a dbt project.
- dbt-date - Date logic and calendar functionality.
- dbt-privacy - Macros to make it easier to protect your customers' data.
- dbt-fivetran-utils - General macros and helpers.
- dbt_metrics - Macros to support secondary calculations and generate business metrics.
- dbt-metabase - Model synchronization from dbt to Metabase.
- dbt-coves - CLI tool for generating a scaffold for your dbt project.
- dbt-profiler - Data profiling and doc block generator.
- dbt_utils - General macros library. A must have.
- dbt_audit_helper - Macros for data audits that compare columns values and schemas between tables.
- dbt-ml-preprocessing - A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros.
- dbt-external-tables - Macros to stage your external sources.
- dbt-feature-store - Macros to build a feature store right within your dbt project.
- dbt-codegen - Macros that generate dbt code, and log it to the command line.
- dbt-init - Create a project and populate as much of the dbt project as possible.
- dbt-artifacts - This package builds a mart of tables from dbt artifacts loaded into a table.
- dbt-erdiagram-generator - This package generates ERD diagrams from a dbt project.
- Terraform-dbt Cloud Module - IaC in dbt Cloud via Terraform.
- dbt2looker - Generate Looker views for dbt models.
- dbt-coverage - Checks dbt docs and tests coverage.
- dbt-meta-testing - Yet another coverage testing package.
- dbt-superset-lineage - Push and pull metadata between dbt and Superset.
- dbtvault - Package for generating and executing ETL for Data Vault 2.0 on Snowflake.
- dbt-invoke - CLI for creating, updating, and deleting dbt property files.
- dbt-unit-testing - Package which contains macros to support unit testing.
Community
Conferences, meetups, discussions, newsletters, podcasts, etc. led by fellow analytics engineers and forums of contact.
- Locally Optimistic - A Slack community of aspiring analytics leaders discussing and sharing lessons learned and challenges from their experiences in using data.
- DataTalks.Club - Global online community of data enthusiasts. Podcasts and blogs, etc. are distributed with high frequency.
- Metadata Weekly - Weekly substack about metadata, the metrics layer and MDS.
- Data & Analytics Events in 2022 - Great curated list of upcoming data analytics conferences.
- Data Council Austin 2022 - Worldwide community-driven analytics conference with a handful of talks relevant to the dbt stack.
- Discourse v2 - Revamped and ported hub of main discussions for the community.
- Coalesce 2021 - Second iteration of the analytics engineer conference.
- Coalesce 2020 - Annual dbt conference full of fascinating use-cases.
- dbt meetups - List of community-led dbt meetups.
- Analytics Engineer Roundup - Official dbt Labs newsletter on topics of the MDS.
- Benn Stancil's Newsletter - Thought-provoking reads from the founder of Mode.
- Data Engineering Weekly - Weekly newsletter of recent trends in Data Engineering.
- Data Engineering Podcast - One of the most popular data engineering podcasts covering great concepts and new products.
- Analytics Engineer Podcast - Official podcast of dbt Labs.
- dbt Slack - Energy-filled hub of analytics engineers (Highly recommended).
- r/dataengineering - Subreddit of data engineering topics.
- Drill to Detail Podcast - Special guests discussing big data, business intelligence, modern data stack.
Sample Projects
Sample projects that work out of the box and reflect publicly available use cases.
- GitLab Data Team - GitLab's open source dbt project.
- attribution-playbook - A worked example to demonstrate how to model customer attribution.
- mrr-playbook - A worked example to demonstrate how to model subscription revenue.
- Use dbt inside Visual Studio Code development containers - Set up your dbt environment with pre-installed extensions.
- dag-stack - dbt-Airflow-GreatExpectations stack.
- Jaffle Shop - A self-contained dbt project for testing purposes.
- Spotify User Analytics - Sample dbt project with Spotify user data.
- dbt-github-workflow - Deploy BigQuery + Airflow.
- airflow-dbt-demo - Demonstration of Airflow integration.
Contributors
Thanks for all the great resources! Can't see your avatar? Check the contribution guide on how you can submit your resources to the community!
