What we shipped #13
Our users spoke, we listened. We have now completely re-architected the backend of the data catalog.
Kaustav Mitra
Jul 30, 2024
2 min read
Introduction
We're in the middle of the week already, and it's time to ship a few more things the team has been working on. As you know, last week we released Paradime Radar, our real-time intelligence app to measure and optimize analytics work. We also released power-ups and updates to our development experience.
This week we are turning our attention to documentation and CI/CD.
Data Catalog Upgrade
We admit that, for quite some time, the data catalog in Paradime has been a challenge to use. Search result quality was poor even for the most basic searches. We had a lot of custom code meant to improve search results, but it added latency without improving search quality. The catalog fired off far too many metadata queries. The schema tree in the left panel was tiny and very hard to use for medium to large projects, and the list goes on. Over time, various people had written some outright bad, under-performing code. Our humble apologies to our users. It was definitely time for us to act.
So... we decided to simplify!
We have completely re-architected the backend of the data catalog - from the ground up.
Redesigned schema panel
Inspired by the fluidity of apps like Notion, we got rid of the old left panel completely and replaced it with a new one. In the new left panel, we provide a view of dbt™ models and the data warehouse schema. In addition, we now show all your data products coming from Paradime integrations in the tree. The tree can be expanded and collapsed, it shows loaders as feedback while content loads, and it remembers your last open state.
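To make "remembers your last open state" a little more concrete, here is a minimal sketch of how a tree could track and restore its expanded nodes. This is not Paradime's implementation: the `TreeState` class and the node paths are hypothetical, and where the saved string actually lives (browser storage, a user-settings table, etc.) is left open.

```python
import json


class TreeState:
    """Tracks which tree nodes are expanded so the view can be restored later."""

    def __init__(self, expanded=None):
        # Node paths such as "warehouse/analytics" are hypothetical identifiers.
        self.expanded = set(expanded or [])

    def toggle(self, path):
        """Expand a collapsed node, or collapse an expanded one."""
        if path in self.expanded:
            self.expanded.remove(path)
        else:
            self.expanded.add(path)

    def dumps(self):
        """Serialize the open state as a JSON list of expanded paths."""
        return json.dumps(sorted(self.expanded))

    @classmethod
    def loads(cls, raw):
        """Rebuild the state from a previously saved JSON string."""
        return cls(json.loads(raw))


state = TreeState()
state.toggle("dbt/models")
state.toggle("warehouse/analytics")
state.toggle("dbt/models")  # collapse it again
saved = state.dumps()
restored = TreeState.loads(saved)
```

Storing only the set of expanded paths keeps the saved state small, and it stays valid when nodes are added elsewhere in the tree.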
Redesigned search and filter
The previous UI was not intuitive and needed a lot of "clicking around". We have gotten rid of all of that and replaced it with a simple, single-line UI where you can search and filter in a single flow.
Search engine upgrade
The biggest change by far has been our improvement in search result quality and relevancy. We decided to show only the data products that are relevant to the dbt™ project connected to a repo. In every workspace, users now see data products linked to that specific workspace only.
While doing this, we deleted all the custom relevancy code that was sitting on top of our Postgres instance. We replaced it with Meilisearch, a lightning-fast, Rust-based, and hyper-relevant search engine. This gave us a 4-10x performance boost. We took the open-source version of Meilisearch and deployed it inside our own cluster for security and a low-latency search experience.
In the process, we also removed a lot of unwanted screen flickers that made the user experience quite frustrating.
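For readers curious what a workspace-scoped query might look like, here is a rough sketch that builds the request for Meilisearch's search endpoint (`POST /indexes/{index_uid}/search`) without sending it. The `q`, `filter`, and `limit` fields are part of Meilisearch's search API. The index name `catalog`, the `workspace_id` attribute, and the helper function are assumptions made for illustration, not our actual schema.

```python
import json


def build_search_request(query, workspace_id, limit=20):
    """Return (path, body) for a workspace-scoped Meilisearch search call."""
    if '"' in workspace_id:
        # Keep the sketch simple: refuse IDs that would need escaping.
        raise ValueError("workspace_id must not contain double quotes")

    # "catalog" is a hypothetical index name.
    path = "/indexes/catalog/search"
    body = {
        "q": query,
        # Filtering only works if workspace_id is listed in the index's
        # filterableAttributes setting.
        "filter": f'workspace_id = "{workspace_id}"',
        "limit": limit,
    }
    return path, body


path, body = build_search_request("orders", "ws_123")
print(path)
print(json.dumps(body))
```

Applying the workspace restriction as a filter on the search engine side, rather than trimming results afterwards, is one way to ensure users only ever see data products from their own workspace.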
Easier lineage
And finally, we introduced better lineage analysis in the catalog. Previously, users could only see one node upstream and one node downstream. That made the lineage very hard to follow, and users had to move away from what they were working on to see the rest. Now users can view the lineage to an arbitrary node depth. We have also introduced a full-screen lineage view so users can stay in context and remain in the flow.
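As an illustration of what "arbitrary node depth" means, the sketch below walks a lineage graph breadth-first and stops after a chosen number of hops. It is a generic depth-limited traversal, not the catalog's actual code, and the model names are made up for the example.

```python
from collections import deque


def lineage(edges, start, depth):
    """Return nodes within `depth` hops downstream of `start`, mapped to hop count."""
    # Build an adjacency list from (parent, child) pairs.
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # depth limit reached, do not expand further
        for nxt in children.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)

    del seen[start]  # report only the related nodes, not the start itself
    return seen


# Hypothetical dbt-style lineage: source -> staging -> fact -> dashboard.
edges = [
    ("raw_orders", "stg_orders"),
    ("stg_orders", "fct_orders"),
    ("fct_orders", "revenue_dashboard"),
]
one_hop = lineage(edges, "raw_orders", 1)
three_hops = lineage(edges, "raw_orders", 3)
```

With a depth of 1 this reproduces the old single-hop view; larger depths reveal the rest of the chain. For upstream lineage, the same function can be called with each edge reversed.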
What's next?
On the catalog, we have a pretty packed roadmap. A lot more integrations, natural language search, further usability improvements, and lots more. Watch this space - or get in touch with us and we'll tell you all about it.