Looking at links between obesity and location

Our food and drink decisions are informed by the places we live, work and learn. What is available, convenient and affordable, as well as how products have been marketed, promoted and packaged, depends on location.

In this project, we have explored how we can combine publicly available data and analyse it with data science methods to group locations into different categories depending on their similarities. This involves a machine learning technique called “clustering” that automatically groups locations together, in this case when they share socio-demographic characteristics and environmental factors. We chose this approach because we thought it could help uncover hidden patterns in the data that could be hard to detect manually.

What did we do?

First, we looked for data that might be useful for clustering locations. We chose sources based on their availability, relevance, ease of access and geographic detail (eg, the level to which local area data is broken down: regions, councils, boroughs or wards). The data sources we used include National Child Measurement Programme (NCMP) childhood obesity prevalence, the CDRC Healthy Assets and Hazards dataset (capturing access to food environments and health-related environmental factors in local areas), ONS Median house prices paid for England and England Health Indices of Multiple Deprivation.

Having collected, cleaned and merged these data, we used data science to cluster locations in England based on their similarities and differences. We then used interpretable machine learning methods to identify the factors that underpin the differences across clusters. Our geographical unit of analysis is a small official geography called a “Lower Layer Super Output Area” (LSOA), with an average population of 1,500 people.
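To make the merge-then-cluster step more concrete, here is a minimal sketch in Python. It is an illustration only, not the project's actual pipeline: the LSOA codes, the feature values, the choice of k-means and the number of clusters are all assumptions made for the example, and the helper names (`merge_on_lsoa`, `standardise`, `kmeans`) are hypothetical.

```python
import random
from statistics import mean, pstdev

# Hypothetical toy tables keyed by LSOA code. All values are invented.
obesity = {"E01000001": 18.0, "E01000002": 24.5, "E01000003": 17.2, "E01000004": 25.1}
house_price = {"E01000001": 650.0, "E01000002": 310.0, "E01000003": 700.0, "E01000004": 295.0}
deprivation = {"E01000001": 12.0, "E01000002": 38.0, "E01000003": 10.5, "E01000004": 41.0}


def merge_on_lsoa(*tables):
    """Keep only LSOAs present in every table; return {code: [feature values]}."""
    shared = set(tables[0]).intersection(*tables[1:])
    return {code: [table[code] for table in tables] for code in sorted(shared)}


def standardise(rows):
    """Rescale each column to zero mean and unit variance so no feature dominates."""
    stats = [(mean(col), pstdev(col) or 1.0) for col in zip(*rows)]
    return [[(value - m) / s for value, (m, s) in zip(row, stats)] for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=20, seed=0):
    """Plain k-means: assign each row to its nearest centre, then recompute centres."""
    centres = random.Random(seed).sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: squared_distance(row, centres[j]))
            for row in rows
        ]
        for j in range(k):
            members = [row for row, label in zip(rows, labels) if label == j]
            if members:  # leave an empty cluster's centre where it was
                centres[j] = [mean(col) for col in zip(*members)]
    return labels


merged = merge_on_lsoa(obesity, house_price, deprivation)
labels = kmeans(standardise(list(merged.values())), k=2)
clusters = dict(zip(merged, labels))  # LSOA code -> cluster label
```

In this toy example the two wealthier, lower-obesity areas end up in one cluster and the two more deprived areas in the other. Standardising first matters because house prices and obesity rates are on very different scales.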

What did we learn?

The map below shows the clusters that different LSOAs in London are assigned to. It shows that most neighbourhoods in the city are in clusters five (blue: dense urban areas with high prevalence of child obesity) and one (orange: wealthier areas with lower prevalence of child obesity). These preliminary results illustrate geographical inequalities in child obesity as well as some of their potential drivers.

Cluster map of London by colour


Our analysis of the clusters shows that measures of deprivation, house price and pollution are the most significant factors behind their differences. Access to the retail environment, GPs and other physical places was also important. The findings are not entirely unexpected, as other studies have shown that obesity rates vary between local areas. However, by creating this model, we are able to use data to start understanding the differences between locations that might be leading to certain outcomes.
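As a rough illustration of how factors behind cluster differences can be ranked, the sketch below scores each feature by how far apart its standardised cluster averages are. This is one simple approach chosen for the example, not necessarily the interpretable method used in the project. The data, the feature named `unrelated_feature` and the function `rank_features` are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical inputs: one row of features per area, plus a cluster label per area.
feature_names = ["deprivation", "house_price", "pollution", "unrelated_feature"]
rows = [
    [40.0, 300.0, 9.0, 5.1],
    [38.0, 320.0, 8.5, 4.9],
    [12.0, 680.0, 4.0, 5.0],
    [10.0, 700.0, 4.5, 5.2],
]
labels = [0, 0, 1, 1]


def rank_features(rows, labels, names):
    """Rank features by the gap between their largest and smallest cluster mean,
    measured in standard deviations so features on different scales are comparable."""
    scores = {}
    for idx, name in enumerate(names):
        column = [row[idx] for row in rows]
        mu, sigma = mean(column), pstdev(column) or 1.0
        z_scores = [(value - mu) / sigma for value in column]
        cluster_means = [
            mean([z for z, label in zip(z_scores, labels) if label == cluster])
            for cluster in sorted(set(labels))
        ]
        scores[name] = max(cluster_means) - min(cluster_means)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_features(rows, labels, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Features whose averages barely change from one cluster to the next land at the bottom of the ranking, which is the intuition behind describing some factors as more significant than others.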

Although our work is at an early stage, it suggests that geography matters when it comes to obesity. Whether a place is urban or rural, and how close it is to the city centre, is an important factor. This may be linked to the time it takes to travel to different parts of the food environment, and to the type and volume of retailers in an area.

What's next?

We want to expand our data by adding in information about food retailers and purchasing as well as adult obesity. We will use these new data and a fine-tuned methodology to segment neighbourhoods into groups with the goal of understanding the local drivers of obesity that could inform policy interventions.

Author

Roberta Sgariglia

Lead Data Scientist, Data Analytics Practice

Roberta was a Lead Data Scientist in the Healthy Life Mission team. She worked closely with other teams to implement data science projects that concern health and mental health.

Jyldyz Djumalieva

Data Science Technical Lead, Data Analytics Practice

Jyldyz Djumalieva was the Data Science Technical Lead working in Data Analytics.

Elena Mariani

Principal Data Scientist, Healthy Life Mission

Elena is a Principal Data Scientist for the Healthy Life Mission.
