Using loyalty card records and machine learning to understand how self-medication purchasing behaviours vary seasonally in England, 2012–2014
Alec Davies; Mark A. Green; Dean Riddlesden; Alex D. Singleton (2024). Applied Marketing Analytics: The Peer-Reviewed Journal, 5(4), 354. DOI: 10.69554/MJDF4395
Abstract
This paper examines objective purchasing information for inherently seasonal self-medication product groups using transaction-level loyalty card records. Predictive models are applied to predict future monthly self-medication purchasing. Analyses are undertaken at the lower super output area level, allowing the exploration of ~300 retail, social, demographic and environmental predictors of purchasing. The study uses a tree ensemble predictive algorithm, applying XGBoost using one year of historical training data to predict future purchase patterns. The study compares static and dynamic retraining approaches. Feature importance rank comparison and accumulated local effects plots are used to ascertain insights of the influence of different features. Clear purchasing seasonality is observed for both outcomes, reflecting the climatic drivers of the associated minor ailments. Although dynamic models perform best, where previous year behaviour differs greatly, predictions had higher error rates. Important features are consistent across models (eg previous sales, temperature, seasonality). Feature importance ranking had the greatest difference where seasons changed. Accumulated local effects plots highlight specific ranges of predictors influencing self-medication purchasing. Loyalty card records offer promise for monitoring the prevalence of minor ailments and reveal insights about the seasonality and drivers of over-the-counter medicine purchasing in England.
Extended Summary
This research investigates whether loyalty card data from retail pharmacies can predict seasonal patterns in over-the-counter medicine purchasing across England. The study analysed anonymised transaction records from approximately 15 million customers of a national high street retailer between 2012 and 2014, focusing specifically on hay fever medicines and cough and cold treatments. Data were aggregated to Lower Super Output Area level (administrative areas containing around 1,500 people) to examine purchasing patterns alongside nearly 300 potential predictors including weather data, air pollution levels, demographic characteristics, and socioeconomic factors. The research employed machine learning techniques, specifically XGBoost (Extreme Gradient Boosting), to build predictive models using one year of historical data to forecast future monthly purchasing patterns. Two modelling approaches were compared: static models trained on a fixed 12-month period, and dynamic models that retrained monthly with updated information.
Clear seasonal patterns emerged for both medicine categories, with cough and cold purchases peaking in winter months (particularly December) and hay fever medicines showing highest sales from March to September. The dynamic retraining approach generally performed better than static models, achieving R-squared values between 0.5 and 0.7, though it struggled when purchasing behaviour differed significantly from previous years. The most important predictors consistently included previous month sales of the same or related products, temperature, and seasonal indicators. Temperature played different roles for each condition: for hay fever, optimal pollen release temperatures (10-15°C and around 19°C) increased purchasing, while cough and cold medicines showed elevated sales during slightly warmer periods (2.5-7.5°C), possibly reflecting virus transmission patterns.
Interestingly, median age of loyalty card holders in areas proved important, with people aged 35-60 showing highest purchasing rates, likely reflecting parents buying medicines for children or replenishing family stocks. However, other demographic and socioeconomic factors showed limited predictive importance, contrary to previous research suggesting social inequalities in self-medication behaviours. Environmental factors beyond temperature had minimal impact, with only sulphur dioxide pollution appearing as an important predictor for respiratory medicines. This research demonstrates the potential for loyalty card data to complement traditional disease surveillance systems by providing real-time insights into minor ailment prevalence across communities. Such data could enable earlier detection of seasonal health trends and support public health planning, particularly as they capture self-medication behaviours that bypass formal healthcare systems.
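The difference between the static and dynamic retraining approaches described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the sales series is synthetic, a one-feature least-squares model on the previous month's sales stands in for XGBoost, and the function names (fit_lag_model, static_forecast, dynamic_forecast) are hypothetical. The 12-month training window follows the summary; treating the dynamic approach as a rolling 12-month window is an assumption, since the summary only says the models were retrained monthly with updated information.

```python
import math


def fit_lag_model(series):
    """Fit y_t = a + b * y_(t-1) by ordinary least squares.

    This simple model is a stand-in for XGBoost, used only to keep the
    sketch dependency-free.
    """
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, previous_value):
    """Predict this month's sales from last month's sales."""
    intercept, slope = model
    return intercept + slope * previous_value


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def static_forecast(series, window=12):
    """Train once on the first `window` months, then predict every later month."""
    model = fit_lag_model(series[:window])
    return [predict(model, series[t - 1]) for t in range(window, len(series))]


def dynamic_forecast(series, window=12):
    """Retrain before each prediction on the most recent `window` months.

    Assumption: a rolling window; an expanding window would be another
    reasonable reading of "retrained monthly".
    """
    preds = []
    for t in range(window, len(series)):
        model = fit_lag_model(series[t - window:t])
        preds.append(predict(model, series[t - 1]))
    return preds


# Synthetic monthly sales: an annual cycle plus a slow upward trend.
sales = [100 + 40 * math.cos(2 * math.pi * m / 12) + 2 * m for m in range(36)]
actual = sales[12:]

static_preds = static_forecast(sales)
dynamic_preds = dynamic_forecast(sales)

print("static  R^2:", round(r_squared(actual, static_preds), 3))
print("dynamic R^2:", round(r_squared(actual, dynamic_preds), 3))
```

Both approaches share the same first prediction, because the first dynamic model sees exactly the same 12 months as the static one; they diverge only as newer months enter the rolling window.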
Key Findings
- Clear seasonal purchasing patterns emerged, with cough and cold medicines peaking in December and hay fever treatments highest from March to September.
- Dynamic machine learning models outperformed static approaches, achieving R-squared values of 0.5-0.7 for predicting monthly medicine purchasing patterns.
- Temperature proved consistently important, with specific ranges (10-15°C, 19°C) increasing hay fever medicine sales corresponding to optimal pollen release conditions.
- Previous month sales and seasonal indicators were the strongest predictors, while demographic and socioeconomic factors showed surprisingly limited influence.
- Loyalty card data from retail pharmacies offers potential for real-time disease surveillance complementing traditional public health monitoring systems.
Citation
@article{davies2024using,
author = {Alec Davies and Mark A. Green and Dean Riddlesden and Alex D. Singleton},
title = {Using loyalty card records and machine learning to understand how self-medication purchasing behaviours vary seasonally in England, 2012–2014},
journal = {Applied Marketing Analytics: The Peer-Reviewed Journal},
year = {2024},
volume = {5},
number = {4},
pages = {354},
doi = {10.69554/MJDF4395}
}