Real estate investing has quietly undergone a massive shift in recent years as the production and distribution of CRE data have proliferated. In a market once defined by local knowledge, most real estate capital was invested locally. When institutional investors began allocating aggressively to real estate, they needed formalized data to make informed investment decisions. What was once an inefficient market operating via relationships, gut intuition, and information asymmetries has become far more efficient as larger pools of capital created a market for the collection and distribution of CRE data.
CoStar, Real Capital Analytics and Trepp have all created successful businesses collecting and organizing the messy, opaque and disparate data that was atomized across owners, brokerages, and esoteric public records. For large investors, those platforms are now table stakes for generating alpha in commercial real estate.
Today, private markets like CRE are becoming more like traditional capital markets, where informational edges are increasingly competed away. Legacy players like CoStar, which had to invest heavily to build their data sets, risk obsolescence as data collection gets simultaneously cheaper and more automated. The market for CRE data is evolving to serve more sophisticated capital that demands differentiated insights rather than simply an aggregation of data points. In today's letter we will survey the evolving CRE data market and provide a framework for understanding that evolution, including:
How the CRE data ecosystem mirrors the financial data value chain's evolution through three critical stages: production, distribution, and activation
Why the most significant value creation in CRE data will accrue to firms that excel at either production or activation rather than distribution
How sophisticated data and analytics are fundamentally reshaping allocation, risk management, and operational strategies in commercial real estate
As pension funds, insurance companies, and global investment managers began allocating billions to CRE, they brought expectations shaped by their experiences in more transparent markets. These sophisticated players couldn't rely on the golf-course conversations and back-of-napkin analyses that had served the industry for generations. They needed standardized metrics, clean data sets, and analytical frameworks that would allow them to evaluate opportunities with the same rigor they applied to stocks and bonds.
Early real estate data providers sought to solve fundamental problems: tracking property ownership and transactions, monitoring rental rates and occupancy levels, and providing the basic building blocks of market analysis. Yet even as these pioneers brought unprecedented transparency to parts of the market, vast swaths of commercial real estate data remained fragmented, inconsistent, and difficult to access.
Today, with the emergence of AI, data collection and cleaning are far more scalable than in the past. Ryan Smith of ProfitIQ, a developer of quantitative analytics for the multi-family sector, points out that nearly all apartments for rent are now listed on rental portals. Web scrapers can continuously monitor these postings and, over time, build reasonably accurate rent rolls. AI can clean this data and cross-reference it with publicly available data sets.
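To make the rent roll idea concrete, here is a minimal sketch of the aggregation step only, written in Python. It assumes the scraping has already happened and that each posting has been reduced to a building, a unit, an asking rent, and the date it was seen. The `Listing` schema, the function names, and the sample data are all hypothetical illustrations, not ProfitIQ's method.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Listing:
    """One scraped rental posting (hypothetical schema)."""
    building: str
    unit: str
    asking_rent: float  # monthly asking rent in dollars
    seen_on: date       # date the posting was observed


def estimate_rent_roll(listings):
    """Keep the most recent asking rent observed for each (building, unit)."""
    latest = {}
    for obs in listings:
        key = (obs.building, obs.unit)
        if key not in latest or obs.seen_on > latest[key].seen_on:
            latest[key] = obs
    return {key: obs.asking_rent for key, obs in latest.items()}


def median_rent_by_building(rent_roll):
    """Summarize the estimated rent roll as a median rent per building."""
    by_building = {}
    for (building, _unit), rent in rent_roll.items():
        by_building.setdefault(building, []).append(rent)
    return {b: median(rents) for b, rents in by_building.items()}


# Made-up observations: unit 2A was listed twice, so only the later rent is kept.
observations = [
    Listing("100 Main St", "2A", 2400.0, date(2024, 1, 5)),
    Listing("100 Main St", "2A", 2550.0, date(2024, 6, 1)),
    Listing("100 Main St", "3B", 2700.0, date(2024, 3, 12)),
    Listing("55 Oak Ave", "1C", 1900.0, date(2024, 2, 20)),
]

rent_roll = estimate_rent_roll(observations)
building_medians = median_rent_by_building(rent_roll)
```

Asking rents are only a proxy for in-place rents, and units that never turn over never appear on a portal, which is why a rent roll built this way is "reasonably accurate" rather than exact.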
Activities that once required floors of researchers and analysts are now performed continuously by AI agents. Alternative data like credit card spending, footfall metrics, and energy consumption are all becoming more accessible. The traditional information advantages that once defined success in commercial real estate are eroding rapidly, replaced by a new paradigm where sophisticated data and analytics create different types of competitive edges. In short, the raw data is quickly becoming a commodity.
In a future where data collection and distribution are coming down the cost curve, value begins to accrue to the firms that are best at “activating” the data: using its expanding availability to create new intellectual property or analytic capabilities. Here again, the CRE market is continuing down a path blazed by traditional financial markets.
The Financial Data Value Chain: Lessons from Financial Markets
The evolution of financial data markets provides a powerful lens for understanding where the commercial real estate data ecosystem is headed. Financial markets underwent a remarkably similar transformation, shifting from relationship-driven trading floors to algorithm-dominated electronic markets over the course of a few decades. This transition created enormous value, spawning billion-dollar data and analytics businesses while fundamentally changing how capital is allocated.
At the heart of this transformation is what The Terminalist calls the "invisible curve" framework—a model that divides the data ecosystem into three distinct stages, each creating value in different ways. The first stage, production, encompasses the creation of raw data at its source. In financial markets, exchanges like the NYSE and Nasdaq generate the primary building blocks: transaction data, order book information, and market activity. These producers literally own the source of truth—the record of what actually happened in the market.
The second stage, distribution, involves the aggregation, standardization, and delivery of data to end users. Bloomberg famously built its empire here, solving the massive coordination problem of collecting data from hundreds of disparate sources and making it accessible through a unified interface. FactSet and Thomson Reuters similarly excel at taking raw, often inconsistent data and transforming it into standardized formats that users can easily consume and compare.
The final stage, activation, is where raw and standardized data transforms into actionable intelligence. Here, firms like MSCI create indexes, risk models, and analytics that help investors make better decisions. Quantitative hedge funds deploy proprietary algorithms to extract trading signals from market data. The value comes not from the data itself but from the intellectual property that turns information into insight.
The invisible curve reveals that the greatest economic value tends to concentrate at the ends—production and activation—rather than in the middle. Data producers own unique, often irreplaceable sources of information, while activators create proprietary intellectual property that transforms commodity data into differentiated insights. The distributors in the middle, while essential to the ecosystem, face constant pressure from new entrants and alternative delivery models.
Mapping this framework onto CRE, brokers, lenders, and municipalities are producers of data, while aggregation platforms like CoStar and Real Capital Analytics are distributors. Thus far, activation has been focused on operator applications like RealPage and VTS. These platforms transformed from leasing management tools into data powerhouses by aggregating information across thousands of properties. Their real-time insights into leasing activity, tenant demand, and market sentiment provide a level of visibility that wasn't possible before the platforms existed. By owning both the production of this valuable data and its activation through sophisticated analytics, they capture the two most valuable ends of the "invisible curve."
While operator-focused platforms have successfully captured value by combining production and activation, they face increasing regulatory scrutiny and antitrust concerns as their market power grows. The investment market, by contrast, offers greater scalability, higher margins, and fewer regulatory headwinds. Investment decisions control trillions in capital allocation, creating enormous willingness to pay for informational advantages that lead to even marginal outperformance.
Moving from Backward Looking to Forward Looking CRE Data
The proliferation of data sources is transforming CRE from a backward-looking, transaction-based market to a forward-looking, insight-driven ecosystem that more closely resembles other sophisticated capital markets. The most advanced investors no longer rely solely on historical sales and rent data to make decisions; they integrate dozens of alternative data sources to develop a multidimensional view of properties and markets.
Consider how this evolution has played out in the retail sector. Twenty years ago, the best information available to retail investors might have been last quarter's sales per square foot reported by tenants. Ten years ago, they could access mobile location data showing overall foot traffic to the property. Today, they can combine location intelligence, credit card transaction data, social media sentiment, and even computer vision analysis of parking lots to develop a real-time, forward-looking view of a center's health and trajectory.
This data revolution is fundamentally changing what's possible in commercial real estate investment and operations. Decisions that were once made annually or quarterly based on backward-looking information can now be made continuously based on real-time insights. Analyses that once focused on a handful of key metrics can now incorporate dozens of factors to develop a more nuanced understanding of value and risk. Strategies that once depended on broad market trends can now be tailored to the specific characteristics and performance drivers of individual assets.
Perhaps the most profound impact has been on decision-making processes that once relied heavily on intuition and experience. Consider the traditional approach to site selection for a new retail development. Twenty years ago, a developer might have driven the area, checked traffic counts at key intersections, reviewed basic demographic information, and relied heavily on their feel for the location. Today, that same developer can access detailed mobile device data showing exactly how many people pass the site, where they come from, how long they stay in the area, and what other locations they visit. They can analyze spending patterns from credit card data, review social media activity in the surrounding community, and use sophisticated models to predict how the area is likely to evolve over time.
This shift from intuition to evidence doesn't eliminate the value of experience. The best investors still need to apply judgment and local knowledge to their decisions. But it does democratize access to information, creating opportunities for newer players to compete effectively with established firms. It also enables more sophisticated approaches to risk management, allowing investors to identify and quantify potential issues that might have gone unnoticed in a less data-rich environment.
Superior information and analytics can help investors identify mispriced assets or markets, positioning them ahead of broader capital flows. Alternative data sources may provide leading indicators of market shifts, allowing forward-thinking firms to anticipate changes before they're reflected in traditional metrics. Proprietary data sets and analytical capabilities create barriers to entry and sustainable edges that are difficult for competitors to replicate.
Institutional investors can now analyze their exposures to specific markets, tenant types, lease durations, and economic drivers with unprecedented granularity. They can stress-test portfolios against various scenarios, from interest rate changes to pandemic-like disruptions, and adjust their holdings accordingly. They can identify correlations and diversification opportunities that might not be obvious from traditional property type and geographic classifications.
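A scenario stress test of this kind can be sketched in a few lines of Python. The sketch below is deliberately simple: it assumes each asset is valued by direct capitalization (value = NOI / cap rate) and that an interest rate scenario can be approximated as a parallel shift in cap rates plus a percentage change in NOI. The `Asset` class, the function names, and the portfolio figures are hypothetical; real institutional models are far richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """A property described by its income and market cap rate (illustrative)."""
    name: str
    noi: float       # annual net operating income, in dollars
    cap_rate: float  # as a decimal, e.g. 0.05 for 5%


def value(asset, cap_rate_shock=0.0, noi_shock=0.0):
    """Direct capitalization value after applying scenario shocks.

    cap_rate_shock is additive (0.01 = +100 basis points);
    noi_shock is proportional (-0.10 = NOI falls 10%).
    """
    stressed_noi = asset.noi * (1.0 + noi_shock)
    stressed_cap = asset.cap_rate + cap_rate_shock
    if stressed_cap <= 0:
        raise ValueError("stressed cap rate must be positive")
    return stressed_noi / stressed_cap


def portfolio_value(assets, cap_rate_shock=0.0, noi_shock=0.0):
    """Sum of asset values under a single scenario applied to every asset."""
    return sum(value(a, cap_rate_shock, noi_shock) for a in assets)


# Made-up two-asset portfolio.
portfolio = [
    Asset("Office A", noi=1_000_000, cap_rate=0.05),
    Asset("Retail B", noi=600_000, cap_rate=0.06),
]

base = portfolio_value(portfolio)
# Scenario: cap rates widen by 100 basis points and NOI falls 10%.
stressed = portfolio_value(portfolio, cap_rate_shock=0.01, noi_shock=-0.10)
drawdown = 1.0 - stressed / base
```

In this made-up scenario the portfolio loses roughly a quarter of its value. Applying different shocks by market, tenant type, or lease duration, rather than one shock to every asset, is where more granular data would enter.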
This data-driven transformation is still in its early stages, with adoption varying widely across the industry. The largest institutional investors and most sophisticated operators have embraced advanced analytics, often building internal capabilities and proprietary systems. Mid-sized firms typically rely on a combination of third-party platforms and targeted internal initiatives. Smaller players may still operate largely as they have for decades, though the democratization of basic market data has at least leveled the playing field somewhat.
Analytics as Alpha
These emerging patterns offer a roadmap for investors, operators, and entrepreneurs seeking to position themselves for success in the data-driven future of the industry. The traditional challenges of commercial real estate - illiquidity, opacity, and valuation uncertainty - are precisely the problems that sophisticated data models are now equipped to solve.
By leveraging machine learning algorithms that continuously ingest multiple data streams, investors can develop dynamic valuation models that update in real time rather than relying on quarterly appraisals or limited comparable sales. These models can incorporate not just traditional metrics like cap rates and NOI, but also alternative signals from foot traffic patterns, social media sentiment, satellite imagery, and IoT sensor networks to provide more accurate, nuanced views of asset values.
The impact on market liquidity could be profound. As valuation confidence increases through data-driven models that reduce information asymmetry, transaction velocity naturally accelerates. Investment firms that deploy these advanced analytics gain the ability to identify mispriced assets more quickly, make faster acquisition decisions with greater conviction, and manage risk more effectively across diverse portfolios. Additionally, data-driven valuation standardization creates the foundation for securitization innovations and secondary market trading platforms that could dramatically enhance CRE liquidity.
Unlike the operator market where data applications primarily drive incremental operational improvements, the investment space offers the transformative potential to fundamentally restructure how commercial real estate is valued, traded, and allocated in institutional portfolios. This strategic focus on investment applications represents the natural evolution of CRE data itself, which has steadily progressed from fragmented local knowledge toward increasingly sophisticated market intelligence.
The development of proprietary data assets represents perhaps the most significant opportunity. The firms that will create the most value in the next generation of CRE data will be those that own unique, difficult-to-replicate sources of information that provide fundamental insights into property performance and market dynamics. Companies deploying IoT sensors across large portfolios aren't just optimizing their operations—they're generating proprietary data streams that could become valuable assets in their own right. Platforms that facilitate leasing, property management, or tenant engagement are creating unique visibility into behavior and preferences that could inform investment decisions. Organizations combining traditionally siloed data sets—property characteristics, tenant information, market trends, alternative data—are developing novel perspectives that couldn't be achieved with any single source.
Beyond raw data, the development of sophisticated analytical capabilities will emerge as a critical source of competitive advantage. The most successful firms won't just collect information; they'll transform it into actionable insights through proprietary models, algorithms, and intellectual property. Take, for example, office owners, for whom the shifting preferences of tenants have caused a bifurcation of the asset class into “Class A” and “Uninvestable”. Better data can turn previously idiosyncratic information like lobby quality and elevator speed into quantified, optimizable characteristics, providing an investment roadmap for making properties competitive.
Advanced machine learning techniques that can forecast rental rates, vacancy trends, and property values will enable more proactive investment strategies.
The application of artificial intelligence to property valuation offers a particularly compelling example of this trend. Traditional appraisal methods, based on limited comparable sales and periodic assessments, are giving way to continuous valuation models that incorporate dozens of factors and update in real time as conditions change. Whereas decisions today are made using MSA or ZIP code level data, firms like Smith’s ProfitIQ are generating highly accurate forecasts with 500 square meter precision. These sophisticated approaches to measuring, pricing, and managing real estate risk will create opportunities for new investment products and risk transfer mechanisms. These models don't just provide more frequent or convenient valuations; they fundamentally change how assets are priced, how portfolios are managed, and how lending is underwritten by incorporating more information and updating more dynamically than human appraisers ever could.
Getting to an End-State for CRE Data Activation
How legacy providers will respond to these changes, as well as how new entrants will develop and deliver new products, are open questions. Current distributors will develop proprietary analytics to move up the value curve, while activators will seek to control more of their data inputs to create defensible moats. The most successful platforms will create APIs and developer tools that enable third-party applications and services. The days of “dashboards” and “terminals” are giving way to a new paradigm where firms want to access, manage, and manipulate data on their own. A slicker UX is no longer a competitive advantage or a sustainable moat.
This presents a fascinating strategic inflection point for firms that possess proprietary valuation models and alternative data capabilities. As these technologies mature, companies face a pivotal question: should they monetize their intellectual property by selling analytics services to the broader market, or should they deploy these insights exclusively for their own investment activities? The answer depends on several critical considerations that reflect the evolving dynamics of the CRE data landscape.
For specialized analytics firms with sophisticated modeling capabilities but limited capital resources, the platform model offers compelling advantages. By selling their insights to multiple investment firms, they can generate recurring revenue streams while benefiting from scale and network effects as their models improve with additional data. Companies like Green Street have built substantial businesses offering proprietary analytics to a wide range of investors, creating value through the breadth of their market intelligence rather than the depth of their capital deployment.
However, the largest asset managers increasingly recognize that their most valuable data innovations may be too strategically important to share. Firms like Blackstone and Brookfield have invested heavily in proprietary data platforms and analytical capabilities that give them information advantages when evaluating acquisition targets, optimizing portfolios, and timing market entries and exits. For these giants, the arbitrage opportunity from being first to identify mispriced assets or emerging market trends far outweighs the potential revenue from selling that intelligence to others. Their massive capital bases allow them to fully exploit their analytical edge at scale, transforming information asymmetry directly into alpha rather than subscription fees.
This bifurcation suggests an emerging "barbell" market structure in CRE analytics. On one end, specialized data providers will offer broad market intelligence and standardized metrics that benefit from wide distribution. On the other, the largest players will continue to develop increasingly sophisticated proprietary systems that remain entirely captive, creating moats around their investment strategies that smaller competitors cannot easily cross. The middle ground of firms with valuable proprietary insights but insufficient capital to fully exploit them will face difficult strategic choices about whether to monetize their IP through services or seek partnerships that allow them to participate more directly in investment returns.
The end state likely mirrors what occurred in public equity markets, where some quantitative insights became widely available through legacy vendors, while the most sophisticated hedge funds kept their most valuable signals and models strictly proprietary.
As commercial real estate continues its journey from relationship-driven decisions toward data-driven allocation, this tension between sharing and hoarding analytical capabilities will shape the competitive landscape for decades to come, determining which organizations capture the greatest value from the ongoing digitization of the world's largest asset class.
—Hunter