In this second part, we go through the graph operators that analytics engineers can use in dbt™ commands to fine-tune their model selections.
In the first part of this series, we looked at the anatomy of a dbt™ CLI command and the various commands available today.
In this follow-up article, we dive deeper into graph operators. We start with an explanation of the graph operators available today and then show advanced uses.
dbt™ CLI’s graph operators help you navigate your data transformation graph. As you may remember, under the hood dbt™ parses all your models and builds the execution graph, a directed acyclic graph (DAG), of your models. With graph operators, you can target specific parts of your dbt™ project or execution graph.
Graph operators in dbt™ are special syntax used with the --select flag to select subsets of your project's graph. They're like secret codes to tell dbt™ exactly which models you want to work with.
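The wildcard is your "grab everything" operator. The command below runs all models under my_schema, which dbt™ matches against each model's fully qualified name (its path within the project).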
dbt run --select my_schema.*
No special character needed. Just use the path!
dbt run --select models/staging
This runs all models in the 'staging' directory.
The plus before a model name selects the model and all of its ancestors (parents, grandparents, and so on).
dbt run --select +final_model
This runs final_model and all models it depends on.
If you want to be specific about how many generations up you go, add a number before the plus, in the format <number>+, as in this example:
# selects the model "black_sheep", its parents and grandparents
dbt run --select 2+black_sheep
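To make the generation limit concrete, here is a minimal Python sketch of an ancestor walk with an optional depth cap. It is not dbt's implementation. The toy DAG, every model name other than black_sheep, and the function names are hypothetical. In dbt's own syntax the number goes before the plus, as in 2+black_sheep.

```python
from collections import deque

# Hypothetical toy DAG: each model maps to the models it depends on (its parents).
PARENTS = {
    "raw_orders": [],
    "stg_orders": ["raw_orders"],
    "int_orders": ["stg_orders"],
    "black_sheep": ["int_orders"],
}


def ancestors(model, max_depth=None):
    """Return the ancestors of `model`, optionally capped at `max_depth` generations."""
    found = set()
    queue = deque([(model, 0)])
    while queue:
        node, depth = queue.popleft()
        # Stop climbing once the generation cap is reached.
        if max_depth is not None and depth >= max_depth:
            continue
        for parent in PARENTS[node]:
            if parent not in found:
                found.add(parent)
                queue.append((parent, depth + 1))
    return found


def select_with_parents(model, max_depth=None):
    """Mimic `+model` (no cap) or `N+model` (N generations up)."""
    return {model} | ancestors(model, max_depth)


everything_upstream = select_with_parents("black_sheep")
two_generations = select_with_parents("black_sheep", max_depth=2)
```

In this toy graph, the capped selection stops at stg_orders and leaves raw_orders out, while the uncapped one climbs all the way to the source.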
The plus after a model name selects the model and all of its descendants (children, grandchildren, and so on).
dbt run --select parent_model+
This runs parent_model and all models that depend on it.
Like the parent operator, the child operator can be limited to a number of generations. Add the number after the plus, in the format +<number> following the model name:
# selects the model "black_sheep", its children and grandchildren
dbt run --select black_sheep+2
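The downstream direction can be sketched the same way: invert the dependency map so each model points at its children, then walk down with an optional generation cap. This is an illustrative Python sketch only, not dbt's code. The model names other than black_sheep and the helper names are hypothetical.

```python
from collections import deque

# Hypothetical toy DAG: each model maps to the models it depends on (its parents).
PARENTS = {
    "black_sheep": [],
    "lamb": ["black_sheep"],
    "grand_lamb": ["lamb"],
    "great_grand_lamb": ["grand_lamb"],
}

# Invert the parent map so each model points at the models that depend on it.
CHILDREN = {name: [] for name in PARENTS}
for child, parents in PARENTS.items():
    for parent in parents:
        CHILDREN[parent].append(child)


def select_with_children(model, max_depth=None):
    """Mimic `model+` (no cap) or `model+N` (N generations down)."""
    selected = {model}
    queue = deque([(model, 0)])
    while queue:
        node, depth = queue.popleft()
        # Stop descending once the generation cap is reached.
        if max_depth is not None and depth >= max_depth:
            continue
        for child in CHILDREN[node]:
            if child not in selected:
                selected.add(child)
                queue.append((child, depth + 1))
    return selected


all_downstream = select_with_children("black_sheep")
two_generations = select_with_children("black_sheep", max_depth=2)
```

Here the capped selection keeps lamb and grand_lamb but drops great_grand_lamb, which sits three generations below black_sheep.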
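The '@' operator goes before a model name and selects the model, all of its children, and all the parents of those children. It's like saying "my kids are coming to the party, and so is everyone they depend on!"
dbt run --select @model_name
This runs model_name, every model that depends on it, and every other model those dependents need in order to build.
The '@' operator is only valid as a prefix, so a trailing form such as model_name@ is not supported. To select the children of a model without the model itself, combine the plus operator with the --exclude flag:
dbt run --select model_name+ --exclude model_name
This runs all children of model_name, but not model_name itself.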
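As dbt's documentation describes it, @model selects the model, its descendants, and all ancestors of those descendants. The Python sketch below illustrates that rule on a toy DAG. It is an approximation for intuition, not dbt's implementation, and every model and function name in it is hypothetical.

```python
# Hypothetical toy DAG: each model maps to the models it depends on (its parents).
PARENTS = {
    "stg_orders": [],
    "stg_customers": [],
    "orders": ["stg_orders"],
    "customer_orders": ["orders", "stg_customers"],
}


def ancestors(model):
    """Return every model that `model` depends on, directly or indirectly."""
    seen = set()
    stack = list(PARENTS[model])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS[node])
    return seen


def descendants(model):
    """Return every model that depends on `model`, directly or indirectly."""
    return {name for name in PARENTS if model in ancestors(name)}


def at_select(model):
    """Mimic `@model`: the model, its descendants, and the ancestors of all of them."""
    selected = {model} | descendants(model)
    for node in list(selected):
        selected |= ancestors(node)
    return selected


plus_suffix = {"orders"} | descendants("orders")  # what `orders+` would pick
at_prefix = at_select("orders")                   # what `@orders` would pick
```

In this example the plain child selection picks only orders and customer_orders, while the '@' selection also pulls in stg_customers and stg_orders, so customer_orders has everything it needs to build.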
Separate selectors with spaces to get their union. This runs model1, model2, and model3.
dbt run --select model1 model2 model3
Separate selectors with commas (and no spaces) to get their intersection.
dbt run --select tag:nightly,staging.*
This runs models that are both tagged 'nightly' and in the 'staging' directory.
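The difference between spaces and commas comes down to set operations: space-separated arguments are unioned, and the comma-separated parts within one argument are intersected. The Python sketch below shows that logic under stated assumptions. The lookup table of selector results is hypothetical and hard-coded, where dbt would resolve each selector against the project graph.

```python
# Hypothetical, hard-coded results for each individual selector.
SELECTOR_RESULTS = {
    "tag:nightly": {"stg_orders", "orders", "daily_report"},
    "staging.*": {"stg_orders", "stg_customers"},
}


def resolve(select_args):
    """Union the space-separated arguments; intersect comma-separated parts of each."""
    selected = set()
    for arg in select_args:
        parts = [SELECTOR_RESULTS[part] for part in arg.split(",")]
        selected |= set.intersection(*parts)
    return selected


# Equivalent to: --select tag:nightly,staging.*
both = resolve(["tag:nightly,staging.*"])

# Equivalent to: --select tag:nightly staging.*
either = resolve(["tag:nightly", "staging.*"])
```

With these sample sets, the comma form keeps only stg_orders, the one model that is both tagged nightly and in staging, while the space form keeps all four models.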
Combine operators for laser-focused selection:
dbt run --select tag:nightly,+final_model
This runs nightly-tagged models that final_model depends on.
Use dbt ls to preview your selection:
dbt ls --select tag:nightly,+final_model
This lists the models matched by the selection without running them.
Refresh critical models together with two generations of their parents and all of their children:
dbt run --select 2+tag:critical+
Test everything related to final reports except the reports themselves:
dbt test --select @tag:final_report --exclude tag:final_report
There you have it, folks! With these graph operators in your toolkit, you're ready to navigate your dbt project like a pro. Mix and match to create powerful, precise model selections.