NBA Challenge Rewind: On the Court with dbt™: Data Modeling the Toronto Raptors
Discover Nikita's insights, data modeling best practices, and his experiences in Paradime's 'NBA Data Modeling Challenge.'
Nikita Volynets
Jun 13, 2024 · 5 min read
Welcome to the "NBA Challenge Rewind" series 🙌
This blog series will showcase the “best of” submissions from Paradime’s NBA Data Modeling Challenges, highlighting the remarkable data professionals behind them.
If you’re unfamiliar with the NBA Data Modeling Challenge, enrich your series experience by exploring these essential resources: the challenge introduction video and the winner’s announcement blog. They offer valuable background information to help you fully appreciate the insights shared in this series.
In each "NBA Challenge Rewind" blog, you’ll discover:
Key NBA insights: Uncover the valuable insights participants derived from historical NBA datasets, revealing hidden stories within the game.
Analytics Engineering best practices: Learn about the participants' approach to project execution, from initial analysis to final insights, including their coding techniques (SQL, dbt™) and the innovative use of tools (Paradime, Snowflake, data visualization).
A Personal Touch: Get to know the motivations, backgrounds, and personal narratives of the analytics professionals who bring the NBA data to life.
A personal invitation to Paradime's next challenge: We're moving from the basketball court to the cinema—get your popcorn ready! 🍿
Let's check out our next installment, exploring Nikita Volynets and his submission!
Nikita's path to the challenge
Hi there! I'm Nikita, currently enjoying life in the stunning city of Vancouver, Canada. With 8 years of experience in data analysis and engineering across various industries, I'm always on the lookout for the latest tools and best practices. Recently, I was in search of a challenge that would let me dive deep into the "modern data stack" through hands-on experience. Luckily, I discovered the NBA Data Modeling Challenge on LinkedIn, offering exactly the multi-tool exploration I wanted.
This challenge lived up to my expectations beautifully. I successfully built a comprehensive dbt project with Paradime, getting acquainted with its advanced features like data lineage, data catalog, and production schedules. Being a huge NBA enthusiast, delving into historical NBA data was particularly exhilarating for me. But let's move beyond introductions. I'm excited to share the journey of building my project and the intriguing NBA insights I discovered!
Toolkit for success
The challenge required Snowflake for data warehousing, Sigma for data visualizations, and Paradime for data modeling (dbt transformations). I hadn’t used Paradime before, but its code IDE is quite similar to VS Code, which I’ve used for several years, so the adjustment was smooth. After building out my dbt transformations in Paradime, I used Hex to take the analysis further. This blend of tools not only supported my project's development but also deepened my understanding of how different data technologies fit together.
Building my project
Here’s a glimpse into how I turned raw data into strategic insights:
Crafting the game plan
I kicked off the project by immersing myself in Snowflake's extensive NBA datasets. This initial exploration was critical, laying the groundwork for all subsequent transformations and insights. As the old saying goes, “Measure twice, cut once.” As an avid Toronto Raptors fan, I aimed to uncover insights that would shine a light on the team’s performance, areas of opportunity, and how they stack up against the competition.
Execution and core resources
With a clear strategy in place, I moved to the execution phase, starting with the creation of a minimum viable product (MVP) in Snowflake. This approach allowed for the iterative development of SQL transformations in Paradime, visualization of data insights through Sigma, and deeper analysis with Hex. The analysis was anchored by three key datasets:
Game statistics through stg_games.sql, providing a detailed history of NBA games.
Financial dynamics captured by stg_teams_spend_by_season, shedding light on team spending patterns.
Team Comparisons facilitated by stg_teams, enabling a benchmarking analysis against other teams.
Now let’s dive into some of the insights uncovered!
Insights uncovered
For a comprehensive view of all my insights, please visit my GitHub repo, my Hex dashboard, and my Sigma dashboard. Below are some of my personal favorites:
Toronto Raptors performance vs. Top-10 NBA teams
This analysis identifies key areas where the Toronto Raptors could improve to potentially break into the top ten teams in the NBA.
Insight: To rank among the top-10 teams, the Toronto Raptors have several areas to improve. Their assists per game are 9.6% lower and their field goal percentage is 5.2% lower than the average of the top-10 teams.
Approach:
By leveraging stg_games.sql and stg_teams_spend_by_season.sql, I isolated the Toronto Raptors game stats and spending history over the last three seasons in ftm_raptors_3y_avg_metrics.sql.
Following a similar process, I isolated the top ten teams (based on win rate) over the last three seasons and their corresponding statistics and spending history in tfm_team_top_10_3y_avg_metrics.sql.
Finally, I joined the two tables together in m_raptors_vs_top_10_comparison.sql to derive insights.
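The three steps above can be sketched in a few lines of Python. This is only an illustration of the comparison logic, not the SQL from my dbt models: the team rows, the metric names (ast_per_game, fg_pct), and the helper functions are hypothetical, the numbers are made up, and the benchmark uses the top 2 teams instead of the top 10 to keep the sample small.

```python
from statistics import mean

# Hypothetical three-season averages per team. These numbers are made up
# for illustration and are NOT taken from the challenge datasets.
TEAM_AVGS = [
    {"team": "Raptors", "win_rate": 0.45, "ast_per_game": 24.0, "fg_pct": 0.45},
    {"team": "Team A", "win_rate": 0.70, "ast_per_game": 27.0, "fg_pct": 0.48},
    {"team": "Team B", "win_rate": 0.65, "ast_per_game": 25.0, "fg_pct": 0.47},
    {"team": "Team C", "win_rate": 0.40, "ast_per_game": 22.0, "fg_pct": 0.44},
]
METRICS = ("ast_per_game", "fg_pct")


def top_n_average(rows, n):
    """Average each metric over the n teams with the highest win rate."""
    top = sorted(rows, key=lambda r: r["win_rate"], reverse=True)[:n]
    return {m: mean(r[m] for r in top) for m in METRICS}


def pct_gap(team_row, benchmark):
    """Percent difference of a team versus the benchmark (negative = lower)."""
    return {
        m: round((team_row[m] - benchmark[m]) / benchmark[m] * 100, 1)
        for m in METRICS
    }


# Step 1: isolate the Raptors; step 2: build the top-N benchmark;
# step 3: compare the two.
raptors = next(r for r in TEAM_AVGS if r["team"] == "Raptors")
benchmark = top_n_average(TEAM_AVGS, n=2)
gaps = pct_gap(raptors, benchmark)
print(gaps)
```

A negative value means the Raptors trail the benchmark on that metric, which is how figures like "9.6% lower" should be read.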
To better understand how I transformed these tables to reach my insights, here’s my data lineage from Paradime:
Toronto Raptors performance dashboard
Comprehensive view of team performance by season.
Insight: At the time of this analysis, the Toronto Raptors have played 46 regular season games this season (2023-24) and currently rank 25th among the 30 NBA teams. Their win rate has decreased by 15% from the previous season, with significant drops in key metrics: Free Throw percentage has fallen by 4%, Steals per Game by 2.0, and Blocks per Game by 0.3. Compared to the league's top teams, the Raptors are underperforming in nearly all areas except Assists per Game. To improve their ranking, they need to focus on raising their Three-Point and Free Throw percentages, along with Blocks per Game.
Approach: By leveraging stg_games.sql and stg_teams.sql, I aggregated key game metrics by team, season, and game type (playoffs and regular season) in tfm_team_grouped_stats.sql. Finally, I performed various analyses in Hex and displayed the results in Sigma.
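To give a feel for this kind of aggregation, here is a minimal sketch that runs a GROUP BY over team, season, and playoffs versus regular season, using Python's built-in sqlite3 module in place of Snowflake. The table layout, the column names (game_type, won, assists), and the sample rows are assumptions made for illustration; the actual tfm_team_grouped_stats.sql model may look quite different.

```python
import sqlite3

# Hypothetical game rows: (team, season, game_type, won, assists).
# The real stg_games model may use different columns and values.
GAMES = [
    ("Raptors", "2022-23", "Regular Season", 1, 25),
    ("Raptors", "2022-23", "Regular Season", 0, 21),
    ("Raptors", "2022-23", "Playoffs", 0, 19),
    ("Raptors", "2023-24", "Regular Season", 0, 27),
]

# One output row per team, season, and game type.
GROUPED_STATS_SQL = """
    select
        team,
        season,
        game_type,
        count(*)     as games_played,
        avg(won)     as win_rate,
        avg(assists) as assists_per_game
    from games
    group by team, season, game_type
    order by team, season, game_type
"""


def grouped_stats(rows):
    """Load the rows into an in-memory SQLite table and aggregate them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table games "
        "(team text, season text, game_type text, won integer, assists integer)"
    )
    conn.executemany("insert into games values (?, ?, ?, ?, ?)", rows)
    result = conn.execute(GROUPED_STATS_SQL).fetchall()
    conn.close()
    return result


stats = grouped_stats(GAMES)
for row in stats:
    print(row)
```

Keeping playoffs and regular season as separate groups avoids mixing two very different samples into a single per-season average.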
Where to go from here
Participating in the NBA Data Modeling Challenge was incredibly rewarding. It allowed me to apply my existing skills and knowledge of modern data tools, while also learning new tools and best practices that will advance my career.
For inquiries or feedback about my project, feel free to connect with me on LinkedIn!
Looking forward, Paradime’s got something exciting on the horizon: a challenge centered around movie data in April. It’s a shift from basketball to the big screen, and honestly, I can’t wait to see what we can uncover within movie datasets. There’s something special about diving into the numbers behind the stories we love on screen. So, if you’ve got a knack for data and love movies, this is your chance to explore, learn, and compete for the $500, $1,000, and $1,500 prizes!