The Paradime product update #6. We shipped: working with prod data during dbt development, and injecting runtime variables into materialized tables.
Welcome to another edition of what we shipped. This one is a bit special because this will be our last shipment for Paradime v2.0. Yep, you heard it right - Paradime v3.0 is coming very, very soon 🚀 🤩.
Most analytics engineers work with development data that is stale, and even worse, they don't know it 😱. Let's rewind a bit. Organizations using dbt™* run production jobs that update their production data warehouse continuously throughout the day. During those runs, the prod warehouse has the freshest schema and the freshest data.
When building dbt™* models during development, analysts use the development schema and the dev warehouse. The dev warehouse typically has only partial data or a partial schema. Some orgs refresh the dev warehouse nightly with a partial or full replication of prod to keep dev current. In other orgs, analysts run a black-box script to copy prod to dev before they start work every morning. Whatever the scenario, users always run into the same problem: as they work through the day, the dev schema falls behind prod.
When you work in a large team of 20 or more analytics engineers, commits happen throughout the day, and some pipelines run every 10 minutes. So every 10 minutes, dev can fall behind prod.
This disconnect between prod and dev leads to the following consequences:
Currently, there are a few ways one can go about this with varying degrees of success:
At Paradime, we want to give our customers the ultimate flexibility in getting their stuff done. Our customers push the boundaries of what is possible. They don't have time to waste, and they want to challenge the status quo.
So, when our customers run their production dbt™* jobs on Paradime, they can now develop continuously against production data and the production schema with a single dbt™* command, as can be seen from the graphic below:
Paradime modified --defer --schedule-name | Analytics Engineering using dbt™
Paradime customers just need to add the --defer --schedule-name=<bolt-schedule-name> modifier at the end of their dbt™* run command, and we take care of fetching the production manifest.json and replacing the local one. We do this with zero overhead for the customer while improving the development experience by orders of magnitude.
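As a sketch, a development run deferring to a Bolt production schedule could look like this. The model name `my_model` and the schedule name `daily_prod_run` are hypothetical placeholders; substitute your own selector and Bolt schedule name.

```shell
# Build one model in dev; unselected upstream refs resolve against
# the production manifest fetched from the named Bolt schedule.
dbt run --select my_model --defer --schedule-name=daily_prod_run
```

Here `--defer` is standard dbt behavior (resolve unbuilt refs against another environment's artifacts), while `--schedule-name` is the Paradime modifier that tells Bolt which production job's manifest.json to fetch.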
Currently, this is a fairly advanced feature. In a future iteration, we are going to make it even more user-friendly, so that platform leaders can roll it out to their entire analytics team with zero training or learning overhead.
So if you move your dbt™* production jobs to the Paradime Bolt scheduler, you can use this feature straight away.
In Paradime Bolt, we now expose a set of runtime variables that can be injected as separate columns into materialized tables through dbt™* Jinja-SQL. The variables are:
These variables can be referenced like standard environment variables in dbt™* model SQL.
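A minimal sketch of what injecting one of these variables into a model might look like, assuming environment-variable-style access via dbt's `env_var` Jinja function. The variable name `PARADIME_SCHEDULE_NAME`, the source model `stg_orders`, and the default value are hypothetical placeholders, not confirmed names.

```sql
-- Hypothetical example: adds a Bolt runtime variable as a column.
-- PARADIME_SCHEDULE_NAME is a placeholder; use the actual variable
-- names exposed by Bolt in your environment.
select
    orders.*,
    '{{ env_var("PARADIME_SCHEDULE_NAME", "manual") }}' as schedule_name
from {{ ref('stg_orders') }} as orders
```

Because the variable is resolved at compile time, the value is baked into each materialized row, which makes it easy to trace which scheduled run produced the data.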
That's all for now as we prepare for Paradime v3.0 and we hope you will benefit from this latest product release. We will be back again with a few more updates. In the meantime, have a great rest of the week - keep building and getting stuff done.
Sign up for a FREE 14-day trial or schedule some time with our team to learn more about Paradime 💪