Data infrastructure requirements for new geodemographic classifications: The example of London’s workplace zones
Alex D. Singleton; Paul A. Longley (2019). Applied Geography, 109, 102038. DOI: 10.1016/j.apgeog.2019.102038
Abstract
In recent years a mix of Open Data and commercial sources have been used to build geodemographic classifications of neighbourhoods. In this paper we argue that geodemographics are coming to embody new thinking about the emergent mixed Big Data economy. This has implications for openness and full scientific reproducibility of classifications, as well as the engagement of stakeholders in the process of building classifications. We propose and implement an operational framework for blending open and other data sources that can stimulate development of classifications that are more timely and data rich yet sufficiently open to peer scrutiny. We illustrate these ideas and challenges by describing the creation and content of the London Workplace Zone Classification.
Extended Summary
This research explores how to build neighbourhood classification systems that combine open government data with proprietary commercial datasets while maintaining scientific transparency and reproducibility. It addresses the challenge of creating geodemographic classifications (systems that categorise small areas by their social, economic and demographic characteristics) in an era when valuable consumer data is increasingly controlled by private companies rather than being freely available. The paper proposes a hybrid framework that lets researchers access and analyse both public and restricted datasets within secure environments whilst preserving methodological transparency.
The methodology involves a three-tier data access system: Public data, which carries no access restrictions; Safeguarded data, which requires user registration and research protocols; and Controlled data, which must be analysed within secure facilities. This framework was implemented to create the London Workplace Zone Classification, which categorises areas by employment characteristics rather than residential patterns. The research used 2011 Census data supplemented with commercial retail intelligence data, business register information, and transport accessibility measures.
Using k-means clustering, five main workplace clusters were identified: Residential Services (local community-focused employment), City Focus (specialised professional activities), Infrastructure Support (transport, utilities and retail), Integrating and Independent Service Providers (high self-employment areas), and Metropolitan Destinations (international service centres). The five clusters were further subdivided into a total of 11 sub-groups, providing more detailed workplace characterisation. The classification reveals London’s distinctive employment geography: most London areas fall into just two categories of the national system, but show much greater diversity when London is classified separately.
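To make the clustering step concrete, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to z-score standardised variables. It is not the authors' code: the function names, the toy values and the choice of k=2 are assumptions made purely for illustration, whereas the paper reports five clusters built from census and commercial inputs.

```python
import random
from statistics import mean, pstdev


def standardise(rows):
    """Z-score each column so that no single variable dominates the distance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    sds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)  # initialise from k sampled observations
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(row, centroids[j]))
            for row in rows
        ]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: one row per zone, columns are two invented
# percentage variables (not values from the paper).
zones = [[60, 5], [58, 6], [62, 4], [10, 30], [12, 28], [8, 32]]
labels, centroids = kmeans(standardise(zones), k=2)
print(labels)
```

Standardising first matters because k-means relies on Euclidean distance, so a variable measured on a larger scale would otherwise dominate the cluster assignment. The numeric label given to each cluster depends on the random initialisation, so only the grouping is meaningful.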
The findings demonstrate that hybrid approaches can successfully balance the competing demands of using the best available data whilst maintaining sufficient openness for academic scrutiny. This work has significant implications for urban planning, transport policy, and economic development strategies. The framework enables the creation of more timely and comprehensive area classifications that can inform public service delivery and commercial decision-making whilst respecting data privacy regulations and commercial interests.
Key Findings
- A hybrid framework successfully combines open government data with restricted commercial datasets whilst maintaining scientific transparency and reproducibility.
- London’s workplace geography requires separate classification from national systems, with five distinct employment clusters identified across the capital.
- The three-tier data access system (Public, Safeguarded, Controlled) enables analysis of sensitive datasets within secure environments without compromising research integrity.
- Consumer data from retail intelligence and transport systems significantly enhances geodemographic classifications beyond traditional census-based approaches.
- Workplace zone classifications reveal functional characteristics during working hours, complementing traditional residential-based neighbourhood analysis for planning applications.
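The three-tier access system named in the findings above can be sketched as a small access check. This is an illustrative model only: the class names, the `can_access` function and the example dataset names are hypothetical, and treating the tiers as cumulative (Controlled access also requiring registration) is an assumption the summary does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PUBLIC = "public"            # no access restrictions
    SAFEGUARDED = "safeguarded"  # user registration and research protocols
    CONTROLLED = "controlled"    # analysis only within a secure facility


@dataclass(frozen=True)
class Dataset:
    name: str
    tier: Tier


@dataclass(frozen=True)
class Researcher:
    registered: bool          # has registered and accepted research protocols
    in_secure_facility: bool  # is currently working inside a secure environment


def can_access(user: Researcher, dataset: Dataset) -> bool:
    """Return True if the user meets the conditions of the dataset's tier."""
    if dataset.tier is Tier.PUBLIC:
        return True
    if dataset.tier is Tier.SAFEGUARDED:
        return user.registered
    # Assumption: Controlled data needs registration as well as a secure facility.
    return user.registered and user.in_secure_facility


# Hypothetical datasets; the tier assignments are invented for illustration.
open_table = Dataset("open_census_table", Tier.PUBLIC)
survey = Dataset("registered_survey", Tier.SAFEGUARDED)
retail_feed = Dataset("commercial_retail_feed", Tier.CONTROLLED)

remote_user = Researcher(registered=True, in_secure_facility=False)
for ds in (open_table, survey, retail_feed):
    print(ds.name, can_access(remote_user, ds))
```

Encoding the tier on the dataset rather than on the user keeps the rule in one place, so the published method can name every input openly while access to the restricted inputs remains governed by the conditions attached to each tier.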
Citation
@article{singleton2019data,
author = {Alex D. Singleton and Paul A. Longley},
title = {Data infrastructure requirements for new geodemographic classifications: The example of London's workplace zones},
journal = {Applied Geography},
year = {2019},
volume = {109},
pages = {102038},
doi = {10.1016/j.apgeog.2019.102038}
}