Creating open source geodemographics: Refining a national classification of census output areas for applications in higher education
Alexander D. Singleton; Paul A. Longley (2009). Papers in Regional Science, 88(3), 643-667. DOI: 10.1111/j.1435-5957.2008.00197.x
Abstract
This paper explores the use of geodemographic classifications to investigate the social, economic and spatial dimensions of participation in Higher Education (HE). Education is a public service that confers very significant and tangible benefits upon receiving individuals: as such, we argue that understanding the geodemography of educational opportunity requires an application-specific classification that exploits under-used educational data sources. We develop a classification for the UK higher education sector, and apply it to the Gospel Oak area of London. We discuss the wider merits of sector specific applications of geodemographics and enumerate the advantages of bespoke classifications for applications in public service provision.
Extended Summary
This research examines how specialised geodemographic classifications can improve understanding of participation patterns in UK higher education by developing a sector-specific neighbourhood typing system. Geodemographics classifies small geographical areas by their demographic and socioeconomic characteristics in order to understand consumer behaviour and service usage. The paper argues that generic commercial geodemographic systems, which rely on shopping questionnaires and lifestyle data, are inadequate for understanding patterns of public service consumption such as higher education participation.
The methodology builds upon the National Statistics Output Area Classification (OAC), an open-source geodemographic system based entirely on 2001 census data. Rather than creating an entirely new classification, the research refined the existing OAC hierarchy by adding a fourth level of 176 ‘microgroups’ and incorporating comprehensive higher education data from HESA (Higher Education Statistics Agency) and UCAS (Universities and Colleges Admissions Service). The enhanced classification includes variables measuring participation rates, A-level scores, social class composition, ethnicity, subject preferences, distance travelled to university, and previous independent school attendance. The research employed k-means clustering algorithms to create a two-tier hierarchical classification with 53 types aggregated into 10 groups. Unlike commercial systems with undisclosed weighting schemes, this classification is transparent about how it was constructed, making it reproducible and suitable for public sector applications.
Testing the classification in Gospel Oak, North London, demonstrated its ability to discriminate between neighbourhoods with different higher education participation patterns. The case study revealed significant variations in course preferences and institutional choices between geodemographic groups. For instance, neighbourhoods classified as Group G showed a higher propensity for medical studies and Russell Group university attendance than Group I areas. However, the research also identified limitations when applying national index scores to predict local variations, in particular a systematic under-prediction of business studies and creative arts participation across all area types in Camden. These discrepancies suggest that local factors, such as specialised schools or prestigious institutions with strong community links, influence participation patterns beyond what demographic data can capture.
The broader significance of this work lies in challenging the assumption that private consumption patterns adequately predict public service usage. The research demonstrates how sector-specific data can enhance geodemographic discrimination for public policy applications, offering more appropriate tools for universities, schools, and education authorities to understand their potential markets and target widening participation initiatives. This approach has important implications for social equity in public service allocation, because transparent classification methods allow scrutiny of how resources are distributed across different communities. The methodology could be extended to other public services where comprehensive administrative data are available, providing more relevant alternatives to commercial geodemographic systems in public sector decision-making.
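The two-tier clustering described in this summary can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not say how the tiers were linked, how variables were weighted or standardised, or how k-means was initialised. The sketch assumes a bottom-up build (areas are clustered into types, then the type centroids are clustered into groups), uses a plain Lloyd's-algorithm k-means written with the standard library only, and runs on synthetic data. The function names `kmeans` and `two_tier`, and the small counts of 6 types and 2 groups, are hypothetical stand-ins for the 53 types and 10 groups reported.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points (an assumption).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def two_tier(areas, n_types, n_groups):
    """Cluster areas into types, then cluster type centroids into groups."""
    type_centroids, area_type = kmeans(areas, n_types)
    _, type_group = kmeans(type_centroids, n_groups)
    # Each area inherits the group of the type it belongs to.
    area_group = [type_group[t] for t in area_type]
    return area_type, type_group, area_group


# Synthetic stand-in for standardised area-level variables (hypothetical data).
rng = random.Random(42)
areas = [[rng.gauss(mu, 1.0) for _ in range(3)] for mu in (0.0, 5.0) for _ in range(30)]

area_type, type_group, area_group = two_tier(areas, n_types=6, n_groups=2)
```

Because every step is explicit, a reader can rerun the sketch and inspect each assignment, which is the kind of transparency the paper contrasts with undisclosed commercial weighting schemes.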
Key Findings
- Developed the first geodemographic classification designed specifically for higher education analysis, using comprehensive sector data from HESA and UCAS.
- Enhanced the National Statistics Output Area Classification by creating 176 microgroups and 53 types incorporating education-specific variables beyond census data.
- Demonstrated significant variation in course preferences and university choices between geodemographic groups in the Gospel Oak, London, case study.
- Identified limitations of national index scores for local prediction, with systematic under-prediction of participation in business studies and creative arts subjects.
- Established a transparent, reproducible methodology for public sector geodemographics as an alternative to commercial systems with undisclosed weighting schemes.
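The finding on national index scores can be made concrete with a small sketch. This summary does not define the index used in the paper, so the code assumes a common geodemographic convention: a group's index is 100 times its participation rate divided by the overall rate, so that 100 means average. All figures are invented for illustration; only the group labels G and I come from the case study, and the function names `index_scores` and `expected_counts` are hypothetical.

```python
def index_scores(entrants, population):
    """Index per group: 100 * (group rate / overall rate). 100 means average."""
    total_entrants = sum(entrants.values())
    total_population = sum(population.values())
    return {
        g: 100 * entrants[g] * total_population / (population[g] * total_entrants)
        for g in entrants
    }


def expected_counts(national_rates, local_population):
    """Entrants expected locally if each group followed its national rate."""
    return {g: national_rates[g] * local_population[g] for g in local_population}


# Hypothetical national figures for two groups (not data from the paper).
national_entrants = {"G": 300, "I": 100}
national_population = {"G": 1000, "I": 1000}
national_rates = {g: national_entrants[g] / national_population[g] for g in national_entrants}

scores = index_scores(national_entrants, national_population)

# Hypothetical local populations and observed entrants for one subject.
local_population = {"G": 200, "I": 400}
observed = {"G": 75, "I": 52}
expected = expected_counts(national_rates, local_population)

# A ratio above 1 means the national rates under-predict local participation.
ratio = {g: observed[g] / expected[g] for g in observed}
```

In this invented example both ratios exceed 1, which mirrors the pattern the paper reports for business studies and creative arts: the national profile under-predicts across all area types, pointing to local influences the classification does not capture.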
Citation
@article{singleton2009creating,
author = {Alexander D. Singleton and Paul A. Longley},
title = {Creating open source geodemographics: Refining a national classification of census output areas for applications in higher education},
journal = {Papers in Regional Science},
year = {2009},
volume = {88},
number = {3},
pages = {643--667},
doi = {10.1111/j.1435-5957.2008.00197.x}
}