Movie Challenge Rewind: Data Modeling Movie Success

Discover the 1st place winner's insights and best practices from the Movie Data Modeling Challenge!

July 10, 2024
A reading icon
4
 min read
Movie Challenge Rewind: Data Modeling Movie Success

Welcome to the "Movie Challenge Highlight Reel" series 🙌

This blog series will showcase the "best of" submissions from Paradime and Lightdash's Movie Data Modeling Challenges, highlighting the remarkable data professionals behind them.

If you're unfamiliar with the Movie Data Modeling Challenge, enrich your series experience by exploring these essential resources: the challenge introduction video and the winner's announcement blog. They offer valuable background information to help you fully appreciate the insights shared in this series.

In each "Movie Challenge Highlight Reel" blog, you'll discover:

  • Key Movie insights: Uncover the valuable insights participants derived from historical Movie datasets, revealing scroll-stopping insights about movies, actors, directors, production companies, finances, and more.
  • Analytics Engineering best practices: Learn about the participants' approach to project execution, from initial analysis to final insights, including their coding techniques (SQL, dbt™) in Paradime.

Now let's check out our first installment, exploring Isin Pesch and her submission!

Introduction

Hey there! My name is Isin Pesch, and I'm a data analytics engineer at Deel. I recently competed in Paradime's Movie Data Modeling Challenge, and I'm proud to say I took first place!

In this blog, I'll start by sharing a few insights I uncovered, and then I'll dive into how I built my project and how I used Paradime to make it all happen.

Insights Uncovered

Below are three of my favorite insights I uncovered, but you can check out the rest in my Github readme.md file.

Top 10 movies by Total Combined Success

Top 10 movies by Total Combined Success | dbt | paradime.io

Approach: I built int_movies_mapping.sql to aggregate individual success metrics like revenue, Rotten Tomatoes rating, IMDb votes, and major awards. I then used int_combined_movie_success.sql to normalize these metrics and combine them into a single success rating.

Most successful actor-director duos

Most successful actor-director duos | dbt | paradime.io

Approach: Similar to the insight above, I leveraged the combined success metric to identify the best actor-director pairs (minimum two movies together) in actor_director_success.sql.

Change in Movie Success Over the Years

Change in Movie Success Over the Years | dbt | paradime.io

Approach: With the combined success metric as my starting point, I grouped all movies by release year and calculated their average combined success by year.

Building my project

To build my project, I began by thoroughly understanding the provided datasets in Snowflake. My initial focus was identifying the primary metric for my dashboard, which centered around the concept of success. I then assessed the data quality, which took significantly more time than anticipated. I did simple quality checks in my staging layer, and solved more complex data issues in the intermediate layer.

With this foundation, I visualized the final structure and key insights needed for my dashboard, which you can see below in my data lineage.

Building my project | data lineage | dbt | paradime.io

Of course, my project game plan wasn't perfectly linear. I ran into several roadblocks and mishaps along the way, but I tried to stay as disciplined as possible to reach the challenge deadline.

How I used Paradime

This was my first time using Paradime, and the learning curve was almost non-existent; I was up and running in minutes. It has all the features I have come to expect from cloud-hosted dbt™ platforms. One feature I found particularly useful throughout development was Paradime's natively-supported code linter, SQL fluff.

Within my .sqlfluff file, I defined rules for formatting and coding conventions. For instance, I ensured all SQL keywords were lowercase and established consistent coloring rules. With a single click, I could apply these rules to my entire model, automatically fixing any violations. This feature streamlined my workflow and ensured my code remained clean and readable, contributing to the project's success.

Wrap Up

Participating in this challenge was definitely worth my time and energy. Diving into movie franchises and seeing the profitability and ratings trends was super interesting, and it helped me improve my analytics engineering capabilities.

I highly recommend signing up for Paradime’s next dbt™ data modeling challenge.

Schedule a call with the team and learn how to maximize the impact of analytics

Interested to learn more?
Try out the free 14-days trial
Close Cookie Preference Manager
By clicking “Accept All Cookies”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage and assist in our marketing efforts. More info
Strictly Necessary (Always Active)
Cookies required to enable basic website functionality.
Oops! Something went wrong while submitting the form.