Connecting dbt™ and Redshift can unlock a significant boost for data teams, enabling efficient data transformations, model creation, & quality tests.
Amazon Redshift is a fully managed, petabyte-scale cloud data warehouse. It uses a massively parallel processing (MPP) architecture for fast execution of complex SQL queries. Performance is optimized with columnar storage and advanced compression, enabling efficient analysis of diverse data sources, including data lakes, operational databases, and streaming data, with minimal ETL.
Redshift seamlessly integrates with AWS services, provides robust security and fine-grained access control, and utilizes machine learning for enhanced query optimization and predictive analytics.
This combination empowers organizations with efficient data handling, optimized performance, and better decision-making capabilities.
To start, you’ll have to create a profiles.yml file in your dbt™ project. This file contains the necessary credentials and connection details for Redshift. There are two primary methods for authentication: username/password and IAM roles.
[Find code for copy-pasting in the code appendix]
For username and password authentication, use the following structure in profiles.yml
1my_redshift_project:
2 target: dev
3 outputs:
4 dev:
5 type: redshift
6 host: your-redshift-cluster.example.com
7 user: your_username
8 pass: your_password
9 port: 5439
10 dbname: your_database
11 schema: your_schema
12 threads: 4
For enhanced security, many data teams prefer using IAM roles. This method eliminates the need to store credentials in the configuration file
When using IAM roles, ensure that the appropriate permissions are set up in AWS to allow dbt™ to assume the role and access Redshift.
1my_redshift_project:
2 target: dev
3 outputs:
4 dev:
5 type: redshift
6 method: iam
7 cluster_id: your_cluster_id
8 host: your-redshift-cluster.example.com
9 port: 5439
10 dbname: your_database
11 schema: your_schema
12 threads: 4
13 iam_duration_seconds: 3600
Here's how to create a dedicated Redshift user for Paradime:
1. Create a new user group and user in Redshift
1-- create group
2create group transformer;
3
4-- create user and assign to group
5create user paradime_prod_user
6 password '<generate_password>'
7 in group transformer;
2. Grant necessary privileges to the user group
1-- grant privileges to the user group
2grant select on all tables in schema information_schema to group transformer;
3grant select on all tables in schema pg_catalog to group transformer;
4
5-- for each schema:
6grant usage on schema <schema_name> to group transformer;
7grant select on all tables in schema <schema_name> to group transformer;
8
9-- grant privileges to user group to create in database
10grant create on database <your_database> to group transformer;
3. Maintain privileges after each dbt™ run: After each production run, ensure that the pardime_prod_user has the correct privileges to read newly created schemas. This step is crucial as Redshift doesn't support granting privileges on future schemas automatically.
4. Add Production Connection in Paradime: Navigate to 'account settings > connections'. Select the Production Environment and enter the required fields using the credentials created for the paradime_prod_user.
For more help, check out our Help Docs.
After configuring the profiles.yml file and setting up the Redshift user, analytics engineers can test the connection using the dbt debug command. This command verifies that dbt™ can successfully connect to Redshift and identifies any configuration issues.
Paradime's product-based pricing offers a refreshing alternative to the unpredictable costs of e.g. dbt™ Labs' consumption-based model. Here’s why we stands out:
It’s time for you to finally experience financial peace of mind - and Paradime is making that happen for you 😏
Paradime's got your back for everything dbt™ and Redshift. Here's why we're crushing it:
How are we doing it?
Ready to leave dbt Cloud™ in the dust? Hit us up for a chat.
Let's skyrocket your analytics game together! 🚀 🙌