Enabling the Data-driven Enterprise
While there is a long road ahead before data-driven decision-making becomes common across organizations, we believe businesses that enable the modern data stack will be an important area for venture capital investment in the years to come.
The term “data-driven” has become a staple of business jargon. Enterprises seek to base everything from corporate strategy to digital advertising to product design on data, leading to the rise of terms like “data-driven decision-making” and the “data-driven enterprise.”
But why are companies so focused on being data-driven? It’s because data, fundamentally, is the digital representation of the world around us and, if appropriately leveraged, can provide critical, quantitative insights into business performance. For example, the simple act of going to a store and buying a pair of shoes gets captured as data (purchase amount, SKU, store location) in a company’s point-of-sale system and eventually becomes one small input data point in that retailer’s analytics. This means that, rather than relying on anecdotal evidence or “gut feel,” companies can now leverage data to inform their strategy and business decisions.
The ubiquity of data is evident in its absolute volume and in the workforce’s improving data literacy. According to Dell’s 2021 Global Data Protection Index, the average organization manages more than 10 times the amount of data it did five years ago, rising from 1.45 petabytes in 2016 to 14.6 petabytes in 2021.[1] For context, a single petabyte is equivalent to approximately 500 billion pages of standard printed text.[2] Similarly, data literacy within the workforce is as important as ever: 85 percent of C-suite executives believe that being data literate will be as vital in the future as the ability to use a computer is today, according to a 2021 Qlik survey of more than 1,200 C-level executives.[3]
However, to achieve a data-driven culture, companies require the appropriate technology and infrastructure to effectively collect, store, and analyze their data. Over the last few years, a reference architecture, known as the modern data stack (explained in more detail below), has emerged to help address the pain points companies experience in their transition to becoming data-driven.
For companies beginning their data-driven journey, the choice of where to start is daunting from a technology perspective. ModernDataStack.xyz, an online database of companies in the data stack ecosystem, currently tracks more than 490 products across 29 different subsectors that make up the modern data stack.
In building a modern data stack, we think it important to step back and focus on the first principles of what it means to be a data-driven enterprise. Our first principles for the modern data stack are access, speed, and trust.
We believe that the first step in building a modern data stack should be asking whether its architecture and tools support the first principles of improving access, speed, and trust across an organization’s data.
The Modern Data Stack
So what exactly constitutes a “modern data stack”? At its core, a modern data stack is a product or set of products that allows a company to ingest data, store it in a central location, transform it from its original format into analytical-quality data, and analyze the results. Beyond these discrete components, a modern data stack should also have appropriate security and privacy controls, as well as end-to-end governance.
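The four stages described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the in-memory SQLite database stands in for a cloud data warehouse, and the sample events, table names, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical raw point-of-sale events. Amounts arrive as strings of cents,
# as raw feeds often deliver untyped values.
RAW_EVENTS = [
    {"sku": "SHOE-001", "store": "Boston", "amount_cents": "8999"},
    {"sku": "SHOE-002", "store": "Boston", "amount_cents": "12000"},
    {"sku": "SHOE-001", "store": "Denver", "amount_cents": "8999"},
]


def ingest(conn, events):
    """Ingest and store: land raw records unchanged in a central location."""
    conn.execute("CREATE TABLE raw_sales (sku TEXT, store TEXT, amount_cents TEXT)")
    conn.executemany(
        "INSERT INTO raw_sales VALUES (:sku, :store, :amount_cents)", events
    )


def transform(conn):
    """Transform: cast and aggregate raw rows into an analysis-ready table."""
    conn.execute(
        """
        CREATE TABLE sales_by_store AS
        SELECT store,
               SUM(CAST(amount_cents AS INTEGER)) AS revenue_cents,
               COUNT(*) AS orders
        FROM raw_sales
        GROUP BY store
        """
    )


def analyze(conn):
    """Analyze: query the modeled table, as a BI tool would."""
    cursor = conn.execute(
        "SELECT store, revenue_cents, orders FROM sales_by_store ORDER BY store"
    )
    return cursor.fetchall()


def run_pipeline(events):
    conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse
    try:
        ingest(conn, events)
        transform(conn)
        return analyze(conn)
    finally:
        conn.close()
```

Calling `run_pipeline(RAW_EVENTS)` returns one row per store. In practice each stage is typically a separate managed product rather than a function in one script.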
Figure 2 provides a high-level overview of the modern data stack from an architectural perspective. The table below it (Figure 3) goes into additional detail, with examples of companies within each area.
Please see footnotes and disclosures for company and holdings information. Fivetran, Airbyte, RudderStack, dbt, Matillion, Collibra, Alation, Atlan, BigID, Satori Cyber, Cyral, and Privacera are privately held and are not owned in any Sands Capital strategies. BigQuery, Dataform, and Looker Technologies are owned by Google. Tableau is owned by Salesforce, which is not held by any Sands Capital strategies.
In defining the modern data stack, it is also important to identify what makes it so “modern.” We believe this comes down to three characteristics these solutions share. First, they are cloud-based, managed products. That is, companies do not need to manage the underlying infrastructure or manually “rack-and-stack” servers in a data center to use these products. Second, modern data stack solutions are scalable. They are designed to handle the volume and speed of data in the modern enterprise. Finally, modern data stack solutions are integrated, promoting an “open” rather than “closed” architecture. This enables products across the data stack to easily pull information from and push information to one another, giving companies flexibility in building their data stack while reducing vendor lock-in.
What’s Next for the Modern Data Stack
Venture capital has been very active in the modern data stack, resulting in a wave of new startups. Based on our analysis of relevant information from PitchBook Data, total capital invested in modern data stack startups across the United States, Canada, and Europe rose from roughly $8.5 billion in 2020 to $28.6 billion in 2021, a year-over-year increase of 236 percent.[5] We expect that the modern data stack will remain an important area of investment.
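As a quick check of the growth figure, the year-over-year change follows directly from the two totals quoted above. The function name is ours, for illustration only.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Total capital invested, in billions of US dollars (figures from the text above)
vc_growth = pct_change(8.5, 28.6)
print(f"{vc_growth:.0f} percent")
```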
As we evaluate the emerging themes within the modern data stack, we revert to our first principles of access, speed, and trust to help guide our thinking. The next sections describe areas we are particularly excited about as the modern data stack matures.
Access: Headless Business Intelligence
A first principle for a data-driven organization is improving access to the analytical data needed for decision-making. Traditionally, business intelligence (BI) products (e.g., Salesforce’s Tableau, Microsoft’s Power BI, Google’s Looker) have been the medium through which businesses access and review their analytical data. However, as workforces become more data literate and data-driven decision-making becomes more widespread, it is difficult to imagine a world in which BI solutions remain the universal consumption layer for analytical data. Analytical data and business metrics can provide enormous value outside dashboards and BI tools when embedded in data science tools, custom internal applications, or third-party software-as-a-service (SaaS) applications (e.g., financial planning and analysis software, customer relationship management systems, marketing operations software). In short, as companies become more data-driven, people will need the right data in the right application to inform each decision, and we believe BI and dashboards alone do not solve this.
Headless BI products, such as Cube Dev, GoodData, Lightdash, and Tinybird,[6] are an emerging category in the modern data stack that helps improve data access across an organization. By headless, we refer to an application that, unlike traditional dashboarding solutions, does not rely on a front-end presentation layer or user interface. Rather, headless solutions leverage APIs to share data with a broad range of downstream systems (e.g., internal applications, third-party SaaS solutions, and BI tools). Headless BI solutions effectively act as a centralized staging area for a company’s key metrics and analytics, which can then be consumed by any application that requires them. An added benefit of headless BI is consistency, both in how metrics are derived (i.e., the transformation logic) and in how the underlying data is used, which improves accuracy and trust in an organization’s analytics.
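To make the consistency point concrete, here is a minimal sketch of the headless pattern: a metric is defined once in a central layer, and two different consumers read it through the same interface. It does not reflect any vendor’s actual API. The `MetricsLayer` class, the `net_revenue` metric, and the sample orders are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical order records standing in for warehouse data.
ORDERS = [
    {"region": "Northeast", "amount": 120.0, "refunded": False},
    {"region": "Northeast", "amount": 80.0, "refunded": True},
    {"region": "West", "amount": 200.0, "refunded": False},
]


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[List[dict]], float]  # the single, shared definition


class MetricsLayer:
    """Holds each metric definition once; every consumer goes through query()."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def register(self, metric: Metric) -> None:
        self._metrics[metric.name] = metric

    def query(self, name: str, rows: List[dict]) -> dict:
        metric = self._metrics[name]
        return {"metric": name, "value": metric.compute(rows)}


layer = MetricsLayer()
layer.register(Metric(
    name="net_revenue",
    description="Sum of order amounts, excluding refunded orders",
    compute=lambda rows: sum(r["amount"] for r in rows if not r["refunded"]),
))


def dashboard_tile(rows: List[dict]) -> str:
    """Consumer 1: a dashboard formats the value for display."""
    result = layer.query("net_revenue", rows)
    return f"Net revenue: ${result['value']:,.2f}"


def api_response(rows: List[dict]) -> str:
    """Consumer 2: an internal app or SaaS tool receives the value as JSON."""
    return json.dumps(layer.query("net_revenue", rows))
```

Because both consumers call `layer.query`, a change to how refunds are treated is made in one place and reaches every downstream application.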
Speed: Augmented Analytics
Current BI and visualization tools are effective at helping businesses understand the “what” behind their data (e.g., what was revenue last month?). Yet, what business analysts really require are quick answers to deeper business questions related to the “why” (e.g., why did revenue drop 10 percent last month?) and the “how” (e.g., how can we increase sales in the Northeast region?). However, there are numerous challenges preventing business analysts from quickly accessing such answers, including: 1) a long backlog of data analysis requests to the analytics team (see Figure 4); 2) lack of the relevant skillset directly within the business line; and 3) an inability to handle the significant data volume required to derive such answers.
As businesses seek to become more data-driven in their decision-making, they require a new set of solutions, in the form of augmented analytics, that place data exploration and analysis closer to the decision-makers. Augmented analytics solutions (like Tellius, ThoughtSpot, and Sisu Data) bridge the gap between BI dashboards and data science by enabling nontechnical business users to leverage artificial intelligence and machine learning (AI/ML) and natural language processing technology to rapidly iterate and explore data to surface insights, without relying on the slow, traditional analytics cycle.[7] Augmented analytics helps businesses move away from static dashboard reports to a more self-service analytical experience, while leveraging AI/ML to deliver faster insights on underlying trends, correlations, and root causes within their data.
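As a rough illustration of a “why” question, the sketch below breaks a revenue decline down by segment to show which region drove it. This is a simple arithmetic decomposition, not the AI/ML or natural language techniques the products above use. The revenue figures and the `explain_change` function are made up for the example.

```python
# Hypothetical monthly revenue by region, in thousands of dollars.
PREVIOUS = {"Northeast": 400, "Southeast": 300, "West": 300}
CURRENT = {"Northeast": 290, "Southeast": 305, "West": 305}


def explain_change(previous, current):
    """Attribute the total change between two periods to each segment."""
    total_prev = sum(previous.values())
    total_change = sum(current.values()) - total_prev
    drivers = []
    for segment in sorted(set(previous) | set(current)):
        delta = current.get(segment, 0) - previous.get(segment, 0)
        drivers.append({
            "segment": segment,
            "delta": delta,
            # Segment's contribution to the total change, in percentage points
            "pct_points": 100 * delta / total_prev,
        })
    drivers.sort(key=lambda d: d["delta"])  # largest decline first
    return {
        "total_change_pct": 100 * total_change / total_prev,
        "drivers": drivers,
    }


report = explain_change(PREVIOUS, CURRENT)
```

Here total revenue falls 10 percent, and the breakdown shows the Northeast accounts for more than the entire decline while the other regions grew slightly.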
Trust: Next-generation Data Governance
A key benefit of modern data warehouses, such as Snowflake and Google’s BigQuery, is the ease with which they scale, allowing companies to store ever-increasing volumes of data with minimal operational or infrastructure impact on the customer. Combine this with the declining cost of data storage over the last decade, and the incentives have now changed to “default store” rather than “default qualify” data going into a data warehouse. In other words, the opportunity cost of not storing data in a data warehouse and missing out on the future potential insights/analytics it could provide is higher than the physical cost of data storage.
As the sources of data and consumers of data grow within an enterprise, next-generation data governance solutions will become increasingly necessary to ensure that the modern data stack does not become a modern data swamp. Next-gen data governance solutions—like Stemma, Atlan, Acryl Data, Select Star, and Secoda—will be critical to improving transparency and trust across an organization’s data assets. These solutions will help with 1) cataloging/inventorying data; 2) discovering data; 3) tracking data lineage and ownership; and 4) maintaining consistent definitions for data and metrics across an enterprise. The distinguishing feature of these next-gen data governance solutions will be their ability to leverage AI/ML on existing metadata across the data stack to automate more of what has historically been a manual, time-consuming process of governing data.
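The cataloging, lineage, and ownership functions listed above can be pictured as a small dependency graph. The sketch below is a toy illustration and does not include the AI/ML-driven automation described. The catalog contents, asset names, and function names are hypothetical.

```python
from collections import deque

# Hypothetical catalog: each data asset records an owner and its direct sources.
CATALOG = {
    "raw_sales": {"owner": "data-eng", "upstream": []},
    "raw_stores": {"owner": "data-eng", "upstream": []},
    "sales_by_store": {"owner": "analytics", "upstream": ["raw_sales", "raw_stores"]},
    "revenue_dashboard": {"owner": "finance", "upstream": ["sales_by_store"]},
}


def upstream_lineage(catalog, asset):
    """Return every asset that `asset` depends on, directly or indirectly."""
    seen = []
    queue = deque(catalog[asset]["upstream"])
    while queue:  # breadth-first walk up the dependency graph
        node = queue.popleft()
        if node in seen:
            continue
        seen.append(node)
        queue.extend(catalog[node]["upstream"])
    return seen


def downstream_impact(catalog, asset):
    """Return the assets that would be affected if `asset` changed."""
    return sorted(
        name for name in catalog if asset in upstream_lineage(catalog, name)
    )


def owners_to_notify(catalog, asset):
    """Return the owners of every asset downstream of `asset`."""
    return sorted({catalog[name]["owner"] for name in downstream_impact(catalog, asset)})
```

With this structure, a question such as “who needs to know if `raw_sales` changes?” becomes a lookup rather than a manual investigation.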
Opportunities in Data-driven Approaches
We are entering an exciting period of innovation for the modern data stack. There is still a long road ahead before data-driven decision-making becomes commonplace across organizations; however, we believe many companies recognize the value and potential competitive advantage a data-driven approach provides, and they are investing in building these capabilities. We believe this market will continue to be an important area for venture capital investment in the years to come, and we look forward to the investment opportunities that industry-changing companies present.
[4] Sands Capital is an investor in WireWheel, a privately held company that makes an intuitive privacy management SaaS platform.
[5] Source data from PitchBook Data, which is owned by Morningstar and is not held in any Sands Capital strategy.
[6] Cube Dev is the company and core developer behind the open source “analytical API platform” Cube.js. GoodData is a software company that provides data and analytics infrastructure. Lightdash (Telescope Technologies) is an open-source BI alternative. Tinybird is a data analytics company. These companies are private and not held by Sands Capital.
[7] Sands Capital is an investor in Tellius. ThoughtSpot is a privately held technology company that produces business intelligence analytics search software. Sisu Data is a privately held company that offers a diagnostic analytics platform for structured data. Neither ThoughtSpot nor Sisu Data is a Sands Capital holding.
A full list of public portfolio holdings, including their purchase dates, is available here. A full list of private holdings is available upon request to qualified investors. Unless otherwise noted, the companies identified represent a subset of current holdings in Sands Capital portfolios and were selected on an objective basis to illustrate examples of the range of companies that enable the data-driven enterprise. They were selected to reflect technology holdings with varied business models and functions within the enterprise. As of 12/31/2021, WireWheel and Tellius were held in the Global Venture strategy. Snowflake was held in the Global Innovation, Select Growth, Global Growth, and Technology Innovators strategies. Google is a subsidiary of Alphabet, which was held in the Global Growth strategy. Amazon was held in the Select Growth, Global Growth, and Technology Innovators strategies. Microsoft was held in the Global Leaders strategy.
The views expressed are the opinion of Sands Capital and are not intended as a forecast, a guarantee of future results, investment recommendations, or an offer to buy or sell any securities.
The views expressed were current as of the date indicated and are subject to change. This material may contain forward-looking statements, which are subject to uncertainty and contingencies outside of Sands Capital’s control. Readers should not place undue reliance upon these forward-looking statements. All investments are subject to market risk, including the possible loss of principal. There is no guarantee that Sands Capital will meet its stated goals. Past performance is not indicative of future results. A company’s fundamentals or earnings growth is no guarantee that its share price will increase. The specific securities identified and described do not represent all of the securities purchased, sold, or recommended for advisory clients. There is no assurance that any securities discussed will remain in the portfolio or that securities sold have not been repurchased. You should not assume that any investment is or will be profitable. Company logos and website images are used for illustrative purposes only and were obtained directly from the company websites. Company logos and website images are trademarks or registered trademarks of their respective owners and use of a logo does not imply any connection between Sands Capital and the company.
References to “we,” “us,” “our,” and “Sands Capital” refer collectively to Sands Capital Management, LLC, which provides investment advisory services with respect to Sands Capital’s public market investment strategies, and Sands Capital Ventures, LLC, which provides investment advisory services with respect to Sands Capital’s private market investment strategies. As the context requires, the term “Sands Capital” may refer to such entities individually or collectively. As of October 1, 2021, the firm was redefined to be the combination of Sands Capital Management, LLC and Sands Capital Ventures, LLC. The two investment advisers are combined to be one firm and are doing business as Sands Capital. Sands Capital operates as a distinct business organization, retains discretion over the assets between the two registered investment advisers, and has autonomy over the total investment decision-making process. This document does not constitute or form part of an offer to sell or issue, or a solicitation of an offer to purchase or subscribe for any securities, is not intended to form the basis of any investment decision and is being made available for informational and discussion purposes only. Information contained herein may be based on, or derived from, information provided by third parties. The accuracy of such information has not been independently verified and cannot be guaranteed. The information in this document speaks as of the date of this document or such earlier date as set out herein or as the context may require and may be subject to updating, completion, revision, and amendment. There will be no obligation to update any of the information or correct any inaccuracies contained herein. The distribution of this document, and the offer and sale of interests in an investment fund managed by Sands Capital (“Fund”), in certain jurisdictions may be restricted by law.
No sale, offer to sell, or solicitation of any offer to buy any interests in a Fund will be made in any jurisdiction in which such offer, sale, or solicitation would be unlawful or to any person to whom it would be unlawful to make such offer, sale, or solicitation. None of the interests to be issued by a Fund have been or will be registered under the U.S. Securities Act of 1933, as amended (the “Securities Act”), or under the securities laws of, or with any securities regulatory authority of, any state or other jurisdiction.