In this third and final part of the series, we dive into selector methods in dbt™ commands, which analytics engineers can use to fine-tune their model selections with even more precision and flexibility during data transformations.
Selector methods in dbt™ allow you to filter resources based on specific properties using the method:value syntax, giving you the power to target exactly what you need.
Most selector methods support Unix-style wildcards. Here's a quick rundown:
*: Matches any number of characters (including none)
?: Matches any single character
[abc]: Matches one character listed in the brackets
[a-z]: Matches one character from the specified range in the brackets
Example:
dbt list --select "*.folder_name.*"
dbt list --select "model_[a-z].sql"
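To get a feel for these wildcards without touching a dbt™ project, the sketch below uses the shell's own case patterns, which follow the same Unix-style rules. It is only an illustration: dbt™ does its own matching internally, and the matches helper and the model names are made up for this example.

```shell
# matches PATTERN NAME
# Succeeds (exit status 0) when NAME matches the Unix-style wildcard PATTERN.
# Hypothetical helper for illustration; it is not part of dbt.
matches() {
  case "$2" in
    $1) return 0 ;;   # left unquoted so that *, ? and [...] act as wildcards
    *)  return 1 ;;
  esac
}

matches "stg_*"       "stg_orders" && echo "* matches any run of characters"
matches "model_?"     "model_a"    && echo "? matches exactly one character"
matches "model_?"     "model_ab"   || echo "? does not match two characters"
matches "model_[abc]" "model_b"    && echo "[abc] matches one listed character"
matches "model_[a-z]" "model_7"    || echo "[a-z] rejects characters outside the range"
```

Each line prints its message only when the wildcard behaves as described in the rundown above.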
1. Tag Selector
Use tag: to select models with a specific tag.
# Run all models with the 'hourly' tag
dbt run --select "tag:hourly"
2. Source Selector
Use source: to select models that reference a specified source.
# Runs all models that reference the fivetran source
dbt run --select "source:fivetran+"
3. Resource Type Selector
Use resource_type: to select nodes of a specific type.
# Runs all models upstream of any exposure
dbt run --select "+resource_type:exposure"
# Lists all tests in your project
dbt list --select "resource_type:test"
4. Path Selector
Use path: to select models/sources defined at or under a specific path.
# Runs all models in the "models/marts" path
dbt run --select "path:models/marts"
# Runs a specific model, "customers.sql", in the "models/marts" path
dbt run --select "path:models/marts/customers.sql"
5. File Selector
Use file: to select a model by filename:
# Runs the model defined in 'model_name.sql'
dbt run --select "file:model_name.sql"
# Note: Adding the file extension is optional
dbt run --select "file:model_name"
6. FQN Selector
Use fqn: to select nodes based on their fully qualified name:
# Runs the model named 'example_model'
dbt run --select "fqn:example_model"
# Runs 'example_model' in 'example_path' within 'project_name'
dbt run --select "fqn:project_name.example_path.example_model"
7. Package Selector
Use package: to select models defined within the root project or an installed dbt™ package.
# Runs all models in the 'fivetran' package
dbt run --select "package:fivetran"
# Note: Adding "package" prefix is optional
dbt run --select "fivetran"
dbt run --select "fivetran.*"
8. Config Selector
Use config: to select models that match a specified node config, written in the form config.<key>:<value>.
# Runs all models that are materialized as tables
dbt run --select "config.materialized:table"
# Runs all models clustered by 'zip_code'
dbt run --select "config.cluster_by:zip_code"
9. Test Type Selector
Use test_type: to select tests based on type.
# Runs all generic tests
dbt test --select "test_type:generic"
# Runs all singular tests
dbt test --select "test_type:singular"
# Run all unit tests
dbt test --select "test_type:unit"
# Run all data tests
dbt test --select "test_type:data"
10. Test Name Selector
Use test_name: to select tests based on the name of the test defined.
# Runs all instances of the 'not_null' test
dbt test --select "test_name:not_null"
11. State Selector
Use state: to select nodes by comparing them against a previous version of the project.
This is a pretty big topic in itself with many variants. For an in-depth understanding check the Paradime docs on state selector. It’s pretty intense 😀.
# Runs all tests on new models, plus new tests on existing models
dbt test --select "state:new" --state path/to/artifacts
# Run all models that have been modified
dbt run --select "state:modified" --state path/to/artifacts
12. Exposure Selector
Use exposure: to select the parent resources of an exposure.
# Tests all models that feed into the monthly_reports exposure
dbt test --select "+exposure:monthly_reports"
# Runs all upstream resources of all exposures
dbt run --select "+exposure:*"
13. Metric Selector
Use metric: to select parent resources of a metric.
# Runs all upstream resources of the monthly_qualified_leads metric
dbt run --select "+metric:monthly_qualified_leads"
14. Result Selector
Use result: to select resources based on their result status from a previous execution.
# Runs all models that successfully ran on the previous execution of dbt run
dbt run --select "result:success" --state path/to/project/artifacts
# Runs all tests that issued warnings on the previous execution of dbt test
dbt test --select "result:warn" --state /path/to/project/artifacts
# Runs all seeds that errored on the previous execution of dbt seed
dbt seed --select "result:error" --state /path/to/project/artifacts
15. Source Status Selector
Use source_status: to select based on the freshness of sources.
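# Refreshes source freshness results so they can be compared with the previous state
dbt source freshness
# Builds everything downstream of sources that are fresher than in the previous state
dbt build --select "source_status:fresher+" --state path/to/prod/artifacts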
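The state, result, and source_status methods all compare against artifacts saved from an earlier invocation, so a scheduled job usually wants to confirm those files exist before relying on them. The sketch below is one way to guard for that. It assumes that state: reads manifest.json, result: reads run_results.json, and source_status: reads sources.json from the --state directory; the helper names required_artifact and can_use are hypothetical.

```shell
# required_artifact METHOD
# Prints the artifact file that a state-based selector method is assumed to read.
required_artifact() {
  case "$1" in
    state)         echo "manifest.json" ;;
    result)        echo "run_results.json" ;;
    source_status) echo "sources.json" ;;
    *)             return 1 ;;   # not a state-based method
  esac
}

# can_use METHOD DIR
# Succeeds only if DIR contains the artifact that METHOD needs.
can_use() {
  artifact=$(required_artifact "$1") || return 1
  [ -f "$2/$artifact" ]
}

# Example usage (commented out so the sketch runs without dbt installed):
# if can_use state path/to/artifacts; then
#   dbt run --select "state:modified" --state path/to/artifacts
# else
#   dbt run   # no previous manifest to compare against, so run everything
# fi
```

Falling back to a full run when the artifact is missing is a design choice for this sketch; failing the job instead may suit a stricter pipeline.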
16. Group Selector
Use group: to select models defined within a specified group.
# Runs all models that belong to the marketing group
dbt run --select "group:marketing"
17. Access Selector
Use access: to select models based on their access property.
# List all public models
dbt list --select "access:public"
18. Version Selector
Use version: to select versioned models.
# Lists versions older than the 'latest' version
dbt list --select "version:old"
# Lists the 'latest' version
dbt list --select "version:latest"
19. Semantic Model Selector
Use semantic_model: to select semantic models.
# Lists the semantic model named "sales" and all of its upstream resources
dbt ls --select "+semantic_model:sales"
20. Saved Query Selector
Use saved_query: to select saved queries.
# Lists all saved queries
dbt list --select "saved_query:*"
21. Unit Test Selector
Use unit_test: to select dbt™ unit tests.
# List all unit tests
dbt list --select "unit_test:*"
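Combine Selectors: Mix and match for precision targeting. A comma between selectors keeps only the resources that match both (intersection); a space keeps the resources that match either (union).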
dbt run --select "tag:nightly,config.materialized:table"
Use Graph Operators: Combine with + and @ for complex selections.
dbt run --select "source:raw_data+,@tag:critical"
Exclude with Negation: Use --exclude to exclude certain models.
dbt run --select "path:models/mart" --exclude "tag:deprecated"
Default Method: If you omit the method, dbt™ will default to one of path, file, or fqn.
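The difference between a space and a comma in --select is easy to get backwards, so here is a small sketch that mimics the two set operations on plain lists. The model names are invented for illustration, and union and intersection are hypothetical helpers rather than dbt™ commands; they only show which models each form would keep.

```shell
# Hypothetical model names: the first list stands in for tag:nightly,
# the second for config.materialized:table.
nightly="orders
customers
payments"
tables="customers
payments
sessions"

# Space-separated selectors act as a union: keep a model found in either list.
union() {
  printf '%s\n%s\n' "$1" "$2" | sort -u
}

# Comma-separated selectors act as an intersection: keep a model found in both.
# (Relies on each list having no internal duplicates.)
intersection() {
  printf '%s\n%s\n' "$1" "$2" | sort | uniq -d
}

echo '--select "tag:nightly config.materialized:table" keeps:'
union "$nightly" "$tables"

echo '--select "tag:nightly,config.materialized:table" keeps:'
intersection "$nightly" "$tables"
```

With these lists, the union keeps all four models, while the intersection keeps only customers and payments.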
By combining selector methods with graph operators, analytics engineers can build dbt™ commands that execute exactly the models they need. This section walks through advanced use cases that you can adapt to your own dbt™ projects.
dbt run --select "tag:nightly +final_model" --exclude "staging.excluded_model+"
What's happening here?
tag:nightly: Selects all models tagged with 'nightly'
+final_model: Adds final_model and all its upstream dependencies (the space between the two selectors makes this a union; a comma would intersect them instead)
--exclude "staging.excluded_model+": Excludes excluded_model and all its downstream dependents
Use case: Run nightly models and critical path, but skip a problematic staging model and its dependents.
dbt run --select "staging.* intermediate.* +analytics.critical_metric"
What's the magic here?
staging.*: All models in the staging directory
intermediate.*: All models in the intermediate directory
+analytics.critical_metric: The critical metric model and its upstream dependencies
Note: The selectors are quoted so that the shell does not try to expand the * wildcards itself.
Use case: Refresh all staging and intermediate models, ensuring a critical metric is up-to-date.
dbt run --exclude tag:hourly tag:weekly
Let's decode:
tag:hourly: Excludes models tagged as hourly
tag:weekly: Excludes models tagged for weekly runs
Use case: Run everything except hourly models and weekly-only models.
dbt run --select tag:critical,tag:nightly,+final_dashboard
What's intersecting? Selects models that are:
Tagged as critical
Tagged as nightly
final_dashboard itself or one of its upstream dependencies
Use case: Run critical nightly models that affect the final dashboard.
dbt run --select path:models/staging/core path:models/mart/finance+
Path perfection:
path:models/staging/core: All models in the core staging path
path:models/mart/finance+: Finance mart models and their children
Use case: Refresh core staging and propagate changes through finance models.
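Before running commands like these against production, it can help to preview what a selector resolves to. The sketch below wraps that habit in a hypothetical preview_then_run helper: it lists the matching models with dbt ls first and only calls dbt run if the listing succeeds. It assumes dbt™ is on your PATH and uses only the --select, --exclude, and --resource-type flags.

```shell
# preview_then_run SELECTOR [EXCLUDE]
# Hypothetical helper: lists the models a selection resolves to, then runs them.
preview_then_run() {
  selector=$1
  exclude=${2:-}

  # Build the shared selection arguments once so both commands see the same set.
  set -- --select "$selector"
  if [ -n "$exclude" ]; then
    set -- "$@" --exclude "$exclude"
  fi

  # Stop here if the selector is invalid or the listing fails.
  dbt ls "$@" --resource-type model || return 1

  dbt run "$@"
}

# Example usage (commented out so the sketch can be loaded without a dbt project):
# preview_then_run "tag:nightly +final_model" "staging.excluded_model+"
```

Sharing one argument list between dbt ls and dbt run keeps the preview honest: what is listed is what gets run.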
There you have it - these advanced graph operator techniques will let you slice and dice your dbt™ project with precision. You will be able to execute exactly the models you want in your production dbt™ pipelines.