Housing has a data problem
Brookings.education – March 2, 2026
The lack of basic tools to track and understand housing has resulted in a patchwork of individual programs and little clarity on whether any of them meet basic access and affordability needs. The promise of AI, which requires structured, standardized inputs, makes addressing this data-infrastructure gap more urgent.
According to the latest United Nations estimates, 2.8 billion people worldwide lack access to adequate housing, while 318 million are homeless. Despite investing billions of dollars in solutions, governments and philanthropies have been unable to make a dent in the crisis.
An underappreciated reason for this is the lack of basic infrastructure to track and understand baseline questions concerning housing. Major data gaps mean we often don’t know which parcels of public land sit idle, how many units are vacant, and where development proposals stall. And without common definitions for fundamental terms, it becomes difficult to make comparisons across contexts – “affordable housing” means one thing in London, another in Lagos, and something else entirely in Los Angeles. Worse, the data that do exist are rarely accessible to policymakers and researchers.
In most cities, no single authority is responsible for tracking which public entity owns which parcel of land. Transit agencies, school districts, and planning departments each hold fragments of information that never connect. Zoning codes vary widely, not just between countries but also between neighboring municipalities.
This fragmentation produces bad policy. Without a full picture of the available resources and the factors that affect housing supply, policymakers cannot reliably identify effective interventions. As a result, a city might invest heavily in subsidized construction while sitting on publicly owned land that could be developed more cheaply. Governments set ambitious housing targets but are unable to track progress or remove bottlenecks, which effectively shields them from any real accountability. The result is a patchwork of individual programs and little clarity on whether any of them meet basic access and affordability needs.
Many hope that AI will finally crack the housing challenge. Machine-learning models can now reconcile disparate databases, detect underutilized land through satellite imagery, and simulate how policy changes might affect housing supply. But these tools require structured, standardized inputs. Realizing the technology’s potential therefore depends on the unglamorous work of data engineering. That makes building this infrastructure even more urgent.
For example, a pilot by the Urban Institute and the Legal Constructs Lab at Cornell University to automate National Zoning Atlas methodologies found that machine-learning models could not reliably interpret zoning documents, owing to inconsistent formatting, legal nuance, and local exceptions. Cities worldwide have experienced what practitioners call the “dashboard valley of death”: expensive visualization tools that fail because the underlying data infrastructure cannot sustain them.
The contrast with successful scientific infrastructure is instructive. The Human Genome Project helped transform the way scientists diagnose and treat disease in part by establishing the Bermuda Principles, which require participating laboratories to release DNA sequences within 24 hours. This ignited a wave of collaboration that later enabled breakthroughs like CRISPR and AlphaFold. After researchers shared SARS-CoV-2 genomes in early 2020, vaccines were developed at unprecedented speed.
A group of experts in housing policy, data infrastructure, and governance recently gathered as part of the 17 Rooms Initiative to discuss this problem. The group agreed that housing needs a similar mechanism: a “Home Genome Project” for standardizing and sharing housing data and AI models globally.
Such a mechanism will require, first, common taxonomies for parcels, zoning types, vacancy definitions, and development stages, designed for interoperability rather than vendor lock-in. Second, cities should share their models and datasets far and wide, enabling genuine comparison of what works across contexts. Third, standards and tools must be accompanied by a playbook for institutional capacity building, including data governance, cross-agency coordination, and the analytical capabilities needed to translate data into decisions.
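To make the first of these pillars concrete, a shared taxonomy can be expressed as a small, vendor-neutral schema that any city could publish against. The sketch below is purely illustrative: the field names, enumerated categories, and ID convention are assumptions for the sake of example, not part of any proposed standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class ZoningType(Enum):
    # A deliberately coarse, cross-jurisdiction vocabulary;
    # local codes would map into these shared categories.
    RESIDENTIAL = "residential"
    COMMERCIAL = "commercial"
    MIXED_USE = "mixed_use"
    INDUSTRIAL = "industrial"

class DevelopmentStage(Enum):
    # Shared stages make "where proposals stall" comparable across cities.
    PROPOSED = "proposed"
    PERMITTED = "permitted"
    UNDER_CONSTRUCTION = "under_construction"
    COMPLETED = "completed"
    STALLED = "stalled"

@dataclass
class ParcelRecord:
    parcel_id: str    # globally unique, e.g. "<city-code>:<local-id>" (assumed convention)
    owner_type: str   # "public", "private", or "nonprofit"
    zoning: ZoningType
    stage: Optional[DevelopmentStage]  # None when no active development proposal
    vacant: bool

    def to_row(self) -> dict:
        """Serialize to a flat, tool-agnostic dictionary for data exchange."""
        return {
            "parcel_id": self.parcel_id,
            "owner_type": self.owner_type,
            "zoning": self.zoning.value,
            "stage": self.stage.value if self.stage else None,
            "vacant": self.vacant,
        }

# Example: an idle publicly owned parcel with no active proposal --
# exactly the kind of asset the article notes cities often cannot see.
parcel = ParcelRecord("LON:000123", "public", ZoningType.RESIDENTIAL, None, True)
```

Even a minimal schema like this would let two cities answer the same question ("how much vacant public residential land do we hold?") with the same query, which is the interoperability the taxonomy is meant to deliver.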
To be sure, housing data present challenges that genomics did not. DNA follows universal biological rules; by contrast, housing varies according to regulatory and political environments. While some variability is necessary to reflect local conditions, much more data can and should be standardized, which will require collaboration, not top-down mandates. Built for Zero has helped more than 150 communities make measurable progress on homelessness through shared data protocols and coordinated action, demonstrating that collective infrastructure can be built to address complex problems.
Philanthropists seeking to strengthen communities, policymakers pursuing housing targets, and technologists developing sector-specific AI models all face the same bottleneck: the data foundation does not exist. Building this infrastructure is not as exciting as funding an app or announcing a new initiative. But without it, allocating resources effectively and learning from experience is impossible. It is as though we were attempting precision medicine with medieval anatomy charts.
The Human Genome Project was a 13-year global undertaking that created an industry worth trillions of dollars. A comparable investment in housing data infrastructure could finally let us see what works, fund what scales, and unlock solutions we cannot yet imagine.