From manual to magical: Smarter warehouse organization with dbt macro overrides

This talk showcases a method for dbt users to improve the organization, discoverability, and tidiness of the models in their project while reducing the kinds and costs of conflicts that arise as a project scales, with ever more models and contributors over time. The scaling issues it covers include model naming collisions, naming conventions that try to work around the limitations of a single namespace, long and hard-to-debug project configuration files, and more. The talk introduces an accessible solution to this madness: macro overrides.

--

As a dbt project (and its data warehouse) grows, an organization will eventually outgrow the single namespace available for dbt model names, also known as node names. dbt provides one way to expand that namespace: custom schema, database, and table alias configurations, which can be set at the node level or the folder level, in several places. At some scale, however, you may end up with either a 1,000-line config file that is hard to parse, or 1,000 custom schema and database overrides that are hard to track down in individual model files.

Additionally, the dbt models directory and your data warehouse may become cluttered and disorganized: models become hard to find, users are confused about how a model in the dbt docs ends up in a particular schema, and data discoverability suffers overall.

Last but not least, dbt allows only a *single model* to carry a given node name, even a common table name that you might reasonably need to reuse across business areas. This is very limiting and forces you to adopt poor naming conventions, or even more complicated table alias configurations, to get around the requirement for unique node names.

This talk introduces a deterministic table materialization pattern built on overrides of the dbt default macros "generate_database_name", "generate_schema_name", and "generate_alias_name". The pattern lets dbt users automate exactly where a model ends up in the data warehouse, with custom names for all three layers of the namespace, using only the node name itself. With this pattern, you are no longer limited to just one table called "dim_customer": you can have a "dim_customer" in every schema in your warehouse if you so choose, without aliasing each model individually. We'll also discuss some project organization patterns that complement the macro automations and improve both data discoverability within the warehouse and the tidiness of the dbt project itself.
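
To make the idea concrete, here is a minimal Python sketch of the kind of name-derivation logic such overrides could encode. It is not the speaker's implementation and not dbt code. The double-underscore convention (`<database>__<schema>__<table>`), the `resolve_relation` helper, and the default database and schema names are all assumptions made for illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """The three namespace layers a model materializes into."""
    database: str
    schema: str
    alias: str


# Hypothetical convention (an assumption, not from the talk):
# node names look like <database>__<schema>__<table>.
SEPARATOR = "__"


def resolve_relation(
    node_name: str,
    default_database: str = "analytics",  # assumed fallback, illustrative only
    default_schema: str = "public",       # assumed fallback, illustrative only
) -> Relation:
    """Derive database, schema, and alias from the node name alone."""
    # Split at most twice so any further separators stay in the alias.
    parts = node_name.split(SEPARATOR, 2)
    if len(parts) == 3:
        database, schema, alias = parts
    elif len(parts) == 2:
        database = default_database
        schema, alias = parts
    else:
        database, schema, alias = default_database, default_schema, node_name
    return Relation(database, schema, alias)


# Two distinct node names that both land in a table called "dim_customer".
finance = resolve_relation("finance__core__dim_customer")
sales = resolve_relation("sales__core__dim_customer")
```

Under this assumed convention the node names stay unique while the table names repeat, which is the behavior the abstract describes. In a dbt project the equivalent logic would live in the three overridden macros, each deriving its layer from the node name.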

08 May 2025, 06:00 PM

Keynote Stage

06:00 PM - 06:30 PM

About The Speaker

Mariah Rogers

Senior Analytics Engineer, Arcadia

An accomplished dbt project hacker, making dbt project configurations bend to her will since 2019.

Secoda

The unified data governance platform

Main Sponsor
