A Modified DBSCAN Clustering Method to Estimate Retail Center Extent
Michalis Pavlis; Les Dolega; Alex Singleton (2018). Geographical Analysis, 50(2), 141-161. DOI: 10.1111/gean.12138
Abstract
This research introduces a new method for the identification of local retail agglomerations within Great Britain, implementing a modification of the established density based spatial clustering of applications with noise (DBSCAN) method that improves local sensitivity to variable point densities. The variability of retail unit density can be related to both the type and function of retail centers, but also to characteristics such as size and extent of urban areas, population distribution, or property values. The suggested method implements a sparse graph representation of the retail unit locations based on a distance‐constrained k‐nearest neighbor adjacency list that is subsequently decomposed using the Depth First Search algorithm. DBSCAN is iteratively applied to each subgraph to extract the clusters with point density closer to an overall density for each study area. This innovative approach has the advantage of adjusting the radius parameter of DBSCAN at the local scale, thus improving the clustering output. A comparison of the estimated retail clusters against a sample of existing boundaries of retail areas shows that the suggested methodology provides a simple yet accurate and flexible way to automate the process of identifying retail clusters of varying shapes and densities across large areas; and by extension, enables their automated update over time.
Extended Summary
This research develops an automated method to identify and map retail centres across Great Britain, addressing the challenge of defining shopping area boundaries that constantly evolve with changing consumer behaviour and economic conditions. The study tackles a significant problem in retail geography: existing town centre boundaries from 2004 have become outdated, and manual approaches to updating them are impractical at a national scale. The research used a dataset of 437,260 retail and service locations across Great Britain, collected by the Local Data Company in 2015, which provides building-level accuracy for shops, restaurants, and service businesses.
Rather than relying on outdated boundaries, the methodology employs a modified version of DBSCAN (Density-Based Spatial Clustering of Applications with Noise), a clustering algorithm that groups nearby retail units into clusters representing shopping areas. The key innovation lies in addressing DBSCAN’s limitation with varying densities: high street shopping areas have different retail densities from city centres or suburban shopping districts, so a single global radius parameter cannot suit all of them. The modified approach creates a network representation of retail locations, then breaks this into smaller, more homogeneous areas before applying the clustering algorithm with locally appropriate parameters. This allows the method to accurately identify both compact shopping centres (like Wolverhampton city centre) and linear high street developments (like Clapham Junction in London).
The research evaluated five different clustering methods across eight representative case study areas including Bristol, Glasgow, Cardiff, and Wolverhampton. DBSCAN consistently provided the most accurate results when compared to official local authority boundaries, whilst also being computationally efficient enough for national-scale application.
The final methodology successfully identified 2,920 retail clusters across Great Britain, with results validated against independent retail boundary data from Geolytix, showing strong spatial correspondence in most areas. The automated approach offers several advantages over manual boundary definition: it provides consistent methodology across different urban contexts, can be updated regularly as retail landscapes change, and captures the full hierarchy of retail centres from major city centres to local neighbourhood shopping areas. This research provides essential infrastructure for retail analysis, enabling systematic monitoring of town centre performance, supporting planning decisions, and facilitating retail location analysis. The open-source methodology offers particular value for local authorities, retail analysts, and researchers studying urban commercial geography, providing an updatable alternative to static administrative boundaries that reflects the dynamic nature of contemporary retail landscapes.
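The pipeline outlined in the abstract (a distance-constrained k-nearest-neighbour graph, decomposition by depth-first search, then DBSCAN applied per subgraph) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the default parameter values, and in particular the rule that sets each subgraph's radius to 1.5 times its median edge length are assumptions made for illustration. The paper instead applies DBSCAN iteratively, selecting clusters whose point density is closest to an overall density for the study area.

```python
import math
import statistics
from collections import defaultdict


def knn_graph(points, k, max_dist):
    """Adjacency list linking each point to its k nearest neighbours within max_dist."""
    adj = defaultdict(set)
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            if d <= max_dist:
                adj[i].add(j)
                adj[j].add(i)  # edges are treated as undirected
    return adj


def components(n, adj):
    """Connected components of the graph, found by iterative depth-first search."""
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        comps.append(sorted(comp))
    return comps


def dbscan(points, idx, eps, min_pts):
    """Plain DBSCAN over the points listed in idx; returns {index: label}, -1 = noise."""
    def region(i):
        return [j for j in idx if math.dist(points[i], points[j]) <= eps]

    labels, cluster = {}, 0
    for i in idx:
        if i in labels:
            continue
        nbrs = region(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels.get(j) == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if j in labels:
                continue
            labels[j] = cluster
            j_nbrs = region(j)
            if len(j_nbrs) >= min_pts:  # only core points expand the cluster
                queue.extend(j_nbrs)
        cluster += 1
    return labels


def local_clusters(points, k=3, max_dist=150.0, min_pts=3):
    """Run DBSCAN separately on each subgraph, with a radius chosen per subgraph."""
    adj = knn_graph(points, k, max_dist)
    result, next_id = {}, 0
    for comp in components(len(points), adj):
        edges = [math.dist(points[i], points[j]) for i in comp for j in adj[i] if i < j]
        if not edges:  # isolated point: no neighbour within max_dist
            result.update({i: -1 for i in comp})
            continue
        # ASSUMPTION: hypothetical local radius, not the rule used in the paper.
        eps = min(max_dist, 1.5 * statistics.median(edges))
        labels = dbscan(points, comp, eps, min_pts)
        for i, lab in labels.items():
            result[i] = -1 if lab == -1 else next_id + lab
        used = [lab for lab in labels.values() if lab != -1]
        if used:
            next_id += max(used) + 1
    return result
```

Given five points spaced 10 m apart, five points spaced 100 m apart some distance away, and one remote point, `local_clusters` assigns the two groups to separate clusters and marks the remote point as noise. A single DBSCAN run with a 30 m radius would instead treat the whole sparser group as noise, which is the density problem the modified method is designed to avoid.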
Key Findings
- Modified DBSCAN algorithm successfully identified 2,920 retail clusters across Great Britain with superior accuracy compared to traditional clustering methods.
- The methodology addresses density variation challenges by using local parameters, enabling identification of both compact centres and linear high streets.
- Validation against independent Geolytix boundaries showed 90% spatial correspondence, demonstrating robust performance across different urban contexts.
- The automated approach provides updatable national retail centre boundaries, replacing outdated 2004 government definitions with contemporary data.
- Open-source methodology enables regular updates to retail boundaries, supporting planning decisions and commercial analysis across multiple scales.
Citation
@article{pavlis2018modified,
author = {Michalis Pavlis and Les Dolega and Alex Singleton},
title = {A Modified DBSCAN Clustering Method to Estimate Retail Center Extent},
journal = {Geographical Analysis},
year = {2018},
volume = {50},
number = {2},
pages = {141--161},
doi = {10.1111/gean.12138}
}