What Is The Data Graph?
The Data Graph transforms mounds of machine data into relatable everyday “things” that you want to ask questions about and then links them together.
This post is one of many in a series that attempts to explain basic concepts and terminology around Observe and “The Observability Cloud.” Topics will range from architecture to deeper technical dives into subjects like Temporal Algebra, Schema-On-Demand, and more.
The Observability Cloud is based on an entirely new architecture that changes how you ingest, store, analyze, and visualize observability data. It comprises three components: the Data Lake to unify telemetry, the Data Graph to map and link relevant Datasets, and Data Apps to make observability more turnkey for your favorite services.
Let’s look more closely at The Data Graph and how it helps the Observability Cloud deliver on these promises.
Transforming Data Into Insights
The Data Graph transforms mounds of machine data (logs, metrics, traces, and beyond) into relatable everyday “things” that you want to ask questions about and then links them together. This allows you to quickly navigate massive amounts of data to extract insights rather than digging through individual events, looking for the proverbial needle in the haystack.
Whether it’s pods, shopping carts, customers, or load balancer logs, the Data Graph allows you to visually explore connections found in your data quickly and easily for a more holistic view of your environment — and its health.
The Data Graph allows you to answer the question of “why” something broke, rather than simply “what” broke, by mapping relationships and dependencies between various resources and events and then providing context to help you ultimately understand the root cause.
Specifically, you can answer questions like “How were my customers affected by the latest push to production?”, “Why did the payment service go down at 3 am?”, or “Do we need to increase our node count after deploying the latest feature?” The Data Graph frees you from having to manually correlate events and resources, ultimately leading to faster and smarter troubleshooting.
Exploring Your Data
Though the Data Graph largely works behind the scenes, and you don’t need to know how it works to use it, features like the Dataset Graph let you visually explore relationships among your data. To date, the Dataset Graph provides three different ways to view your Datasets: Links, Lineage, and Focus.
Using the Links view, users can visually explore connections found between Datasets and optionally display the status of each Dataset, such as whether the Dataset is currently receiving data or not.
The Lineage view (above) acts as a dependency map that shows how each Dataset is derived, with the source Datasets on the left and their destination Datasets on the right.
Lastly, the Focus view displays the currently selected Dataset and the links to and from it. The Data Graph makes it easy to link Datasets together to create a relational database of all your data, from which you can derive insights and understand the relationships between different data points.
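To make the Lineage and Focus views a little more concrete, here is a minimal Python sketch of a graph of Datasets. It is only an illustration of the idea, not Observe's implementation or API: the DatasetGraph class, its method names, and the Dataset names ("Raw Logs", "Container Logs", "Pod", "Node") are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetGraph:
    """Toy model of Datasets joined by links and lineage (hypothetical, not Observe's API)."""

    # links[a] holds the Datasets that a points to, e.g. through a shared key
    links: dict[str, set[str]] = field(default_factory=dict)
    # derived_from[a] holds the source Datasets that a is built from
    derived_from: dict[str, set[str]] = field(default_factory=dict)

    def add_link(self, src: str, dst: str) -> None:
        self.links.setdefault(src, set()).add(dst)

    def add_derivation(self, source: str, derived: str) -> None:
        self.derived_from.setdefault(derived, set()).add(source)

    def lineage(self, name: str) -> list[str]:
        """Lineage-style view: every upstream source of a Dataset, nearest first."""
        seen: list[str] = []
        frontier = sorted(self.derived_from.get(name, set()))
        while frontier:
            current = frontier.pop(0)
            if current not in seen:
                seen.append(current)
                frontier.extend(sorted(self.derived_from.get(current, set())))
        return seen

    def focus(self, name: str) -> dict[str, list[str]]:
        """Focus-style view: the links out of and into a single Dataset."""
        outgoing = sorted(self.links.get(name, set()))
        incoming = sorted(s for s, dsts in self.links.items() if name in dsts)
        return {"links_to": outgoing, "linked_from": incoming}


# Assumed example data: logs are refined into Pods, and Pods link to Nodes.
graph = DatasetGraph()
graph.add_derivation("Raw Logs", "Container Logs")
graph.add_derivation("Container Logs", "Pod")
graph.add_link("Container Logs", "Pod")
graph.add_link("Pod", "Node")

print(graph.lineage("Pod"))  # ['Container Logs', 'Raw Logs']
print(graph.focus("Pod"))    # {'links_to': ['Node'], 'linked_from': ['Container Logs']}
```

Keeping lineage (how a Dataset is derived) separate from links (how Datasets reference each other) mirrors the distinction between the Lineage view and the Links and Focus views described above.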
Reduce Your Investigation Times
The Data Graph is driven by underlying data found in the Data Lake — the single destination for all of your observability data — and how the two pieces interact is crucial for how we help users expedite their investigations. Aside from giving logical context to help visualize relationships in your environment, Observe also lets users “accelerate” data out of the Data Lake into the Data Graph to make it faster to query.
When a Dataset is accelerated, the Dataset is optimized for the specific query patterns used on that Dataset. Accelerated data can be accessed much more quickly than the raw data. So when you are navigating your environment with the Data Graph and want to drill down into something specific, having accelerated data in the relevant Dataset will expedite any searches.
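One way to picture why this helps, as a rough analogy rather than a description of how Observe actually accelerates Datasets, is the difference between scanning raw events on every query and doing the work once ahead of time. In the Python sketch below, the event fields and the scan and accelerate functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw events, standing in for data sitting in the Data Lake.
raw_events = [
    {"pod": "checkout-1", "level": "error", "message": "timeout"},
    {"pod": "cart-7", "level": "info", "message": "started"},
    {"pod": "checkout-1", "level": "info", "message": "retry ok"},
]


def scan(events, pod):
    """Unaccelerated query: reads every raw event on every call."""
    return [event for event in events if event["pod"] == pod]


def accelerate(events, key):
    """Build, once, a lookup keyed on the field that queries filter by."""
    index = defaultdict(list)
    for event in events:
        index[event[key]].append(event)
    return dict(index)


by_pod = accelerate(raw_events, "pod")  # done ahead of the query

# Both paths return the same events; the lookup avoids rescanning the raw data.
assert by_pod["checkout-1"] == scan(raw_events, "checkout-1")
```

The trade-off in this sketch is the usual one: some up-front work and storage in exchange for faster answers to the questions you ask most often.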
Building Your Data Graph
The Data Graph is a feature unique to Observe. It makes sense of vast quantities of observability data and reduces troubleshooting time by bringing you relevant context quickly. To get those benefits, you first have to get data into the Data Lake.
To do this, Observe provides dozens of Apps and integrations to make collecting telemetry (logs, metrics, and traces) from your favorite services a breeze. And because open-source plugins power Observe’s Apps and integrations, we can handle any type of data, in any format, all while avoiding vendor lock-in.
Using both AWS and Kubernetes? Deploy both Apps and the effect will compound, making all the relevant connections between the Datasets packaged in those Apps. Or, install Apps like the Salesforce App to see how, and when, issues in your environment impact your customers.
In our next post, we’ll explore in more detail how Data Apps work, what out-of-the-box content they provide to make observability turnkey, and their role in the Observability Cloud.
If you’d like to learn more about the Observability Cloud, check out the entire series here.
Or, if you’re ready to see how the Observability Cloud will change how you ingest, store, analyze, and visualize your observability data, click here to get access today!