London’s last great smog covered the city in December 1952. Fumes from coal burnt by Londoners were pushed back down to street level by an anticyclone above the city: for four days, all who lived there inhaled a toxic mix of smoke particles, hydrochloric acid and, most dangerously of all, sulphuric acid. By the time the fumes lifted on December 9th, 4,000 people had died as a result of the pollution.
The Great Smog is still considered the worst air pollution event in European history. Since then, thanks in part to legislation brought in during the years after the crisis, such catastrophic events have disappeared from the UK. The sulphur dioxide levels in London’s air believed to have made the smog so deadly have fallen significantly. Even so, air quality in the city, as in many others, still causes thousands of premature deaths per year. In 2015, pollution levels in London were found to breach legal limits under EU law.
In response, London devoted £20 million ($28 million) in 2015 to bringing down pollution significantly in a decade. Now, through one of the capital’s flagship schemes, the Clean Air in London Project, data scientists from the Alan Turing Institute will work with the Greater London Authority (GLA) to create a tool to track the changing patterns of pollution across the city in minute detail, using data crowdsourced from Londoners themselves – thousands of them.
“London is lucky enough to have around 100 high-fidelity sensors, the London Air Quality Network,” said Dr Theo Damoulas, Assistant Professor of Data Science at the University of Warwick and organiser of the project, which is funded by the Lloyd’s Register Foundation via the Turing program on Data Centric Engineering. “At the same time, there are a lot of emerging technologies on the sensing side that provide for the first time the opportunity to deploy large scale networks of sensors, which might be less accurate in some cases, but also low cost: therefore you could have access to thousands.
“We know that air pollution is driven a lot by traffic… there are another 50-60% of emissions sources that are outside traffic and are unaccounted for.”
The project is an ambitious attempt to source data directly from citizens via an online platform. Damoulas’s team will create a system to pull air quality readings from sensors across London: drawing on the existing network, supplementing it with data from environmental organisations, and adding measurements from citizens equipped with sensors provided by the team. While they currently have access to data from 100 sensors, this would pull in readings from thousands. By combining these with other datasets, the team will reveal the factors that drive air pollution down to the level of individual streets, and will be able to measure the impact of new policies.
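One way to combine a handful of accurate reference sensors with thousands of noisy low-cost ones is inverse-variance weighting, where each reading counts in proportion to how trustworthy it is. The sketch below is illustrative only: the sensor values and noise figures are invented, and the project’s actual models are more sophisticated than a single weighted average.

```python
# Hypothetical sketch: fusing one reference-grade reading with several
# noisy low-cost readings at the same location via inverse-variance
# weighting. All numbers are illustrative assumptions, not project data.

def fuse_readings(readings):
    """Combine (value, variance) pairs into a single estimate.

    Each reading is weighted by the inverse of its error variance, so
    accurate sensors dominate, but many cheap sensors still add signal.
    """
    weights = [1.0 / var for _, var in readings]
    total = sum(weights)
    estimate = sum(w * value for w, (value, _) in zip(weights, readings)) / total
    variance = 1.0 / total  # the fused estimate is more certain than any input
    return estimate, variance

# One reference-grade NO2 reading (variance 1.0) plus three low-cost
# readings (variance 25.0), in µg/m³.
readings = [(42.0, 1.0), (48.0, 25.0), (39.0, 25.0), (45.0, 25.0)]
estimate, variance = fuse_readings(readings)
```

The fused value stays close to the trusted sensor, while the cheap sensors nudge it slightly and tighten its uncertainty.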
It’s an example of a “citizen science project”, a model which has emerged in recent years as scientists have recognised the power of the internet as a research tool. By recruiting crowds willing to collect data on their behalf, these programs save scientists and researchers the time and effort of gathering vast amounts of information themselves.
“It will involve Londoners, who are really interested in air pollution, and have been over many years”
“We want to advance the state of the art here, because we can exploit it to effectively play little games: offer some reward if you sample those locations for us,” said Damoulas. “It could be a very large citizen science project, where we give away some sensors, you attach them to your mobile phone, and you do some sampling for us.”
eBird, a platform launched by ornithologists and computer scientists at the Cornell Lab of Ornithology in the US, is probably recent history’s standout citizen science project. An international database of bird sightings, from common species to the rarest endangered ones, it tracks the movement of bird populations across the planet using data submitted by the public.
From its beginnings in the early 2000s, eBird has grown to become the world’s largest biodiversity-related citizen science project, with over 100 million bird sightings contributed each year by the platform’s users. The platform’s creators saw the opportunity to capitalise on the international community of hobbyist “birders”, offering them useful features such as lists and mapping functions to draw them in and encourage them to submit data, alongside the sense of shared thinking that comes from contributing to a global science project.
Damoulas previously worked as a research associate on eBird’s team and helped to develop some of their methods. “We think it’s going to be fantastic because it will involve Londoners, who are really sensitive and really interested in air pollution in London, and have been actually over many years,” he said. “What if you could choose where to place the sensor? We can speak about how you can guide citizens to sample locations that are of interest – those might be areas for instance where my model has the highest uncertainty, where I want more information.”
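Directing volunteers to where a model is least certain is a form of active learning. As a toy stand-in for “highest model uncertainty”, the sketch below simply sends the next volunteer to the candidate location farthest from any existing sensor; the coordinates are made up, and a real system would rank locations by a spatial model’s predictive variance instead.

```python
# Illustrative active-learning step: pick the candidate location that is
# most isolated from the existing sensor network. Distance-to-nearest-
# sensor is an assumed proxy for model uncertainty, not the project's
# actual criterion.
import math

def nearest_sensor_distance(point, sensors):
    """Distance from a candidate point to its closest existing sensor."""
    return min(math.dist(point, s) for s in sensors)

def next_sampling_location(candidates, sensors):
    """Greedy choice: maximise distance to the current network."""
    return max(candidates, key=lambda p: nearest_sensor_distance(p, sensors))

sensors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
candidates = [(0.5, 0.5), (1.0, 1.0), (2.0, 2.0)]
target = next_sampling_location(candidates, sensors)
```

After each new reading arrives, the chosen point joins the sensor list and the next volunteer is steered somewhere else, spreading coverage across the map.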
In order for the project to be successful, Damoulas’s team needs to provide citizens with incentives to collect air quality data on their behalf. To this end, they are working with environmental groups such as Friends of the Earth and Greenpeace to find a willing audience. Existing networks show that there is appetite among sections of the public for grassroots science. The Things Network, a group of hobbyists from around the world who use Internet of Things (IoT) devices to build sensing networks, is a good example.
Forecasting pollution and testing policies
Finding ways to collect new data makes up only part of the team’s remit; the value for the city comes from what they can do with it. By combining the air quality data with other datasets, both open data and some of the GLA’s own, Damoulas plans to build a high-resolution analysis platform, which will allow them to understand the causes of air pollution, and from this, predict where it will occur.
Using machine learning, his team will be able to analyse the causes of the build-up of air pollution in specific areas and at particular times. Damoulas said they plan to “do what we call hyperlocal estimation of air pollution: high-resolution estimates, dynamic models that can track it in space and time – so that you can produce not just a map that’s very static, but really be able to provide high-resolution forecasts and real-time coverage.”
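A dynamic estimate that tracks pollution through time, rather than a static map, can be illustrated with something as simple as an exponentially weighted average per location. This is a deliberately minimal sketch under invented data; the team’s actual spatio-temporal models are far richer, but the principle of weighting recent readings more heavily is the same.

```python
# Minimal sketch of a dynamic, per-location estimate: an exponentially
# weighted average that follows a pollution series as it drifts. The
# hourly NO2 values below are invented for one hypothetical street.

def track(readings, alpha=0.5):
    """Online smoother: each new reading pulls the estimate toward it,
    so the estimate adapts instead of staying static."""
    estimate = readings[0]
    history = [estimate]
    for value in readings[1:]:
        estimate = alpha * value + (1 - alpha) * estimate
        history.append(estimate)
    return history

hourly_no2 = [40, 42, 55, 60, 58, 41, 39]
estimates = track(hourly_no2)
```

The smoothed series rises through the midday spike and falls back afterwards, giving a real-time picture rather than a single averaged number.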
“We’re interested in detecting changes. We have methodologies that would monitor online all of the sensors that would actually pick up what we call ‘change points’: when the process has drastically changed from what it was before. That’s very interesting for policy evaluation.”
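Change-point detection of the kind Damoulas describes can be sketched with a one-sided CUSUM statistic, which accumulates evidence that a series’ mean has dropped below its baseline and raises a flag once that evidence crosses a threshold. The data, baseline, and threshold below are all illustrative assumptions, not the project’s method.

```python
# Hedged sketch of online change-point detection with a one-sided CUSUM
# statistic: flag the hour at which a pollution series' mean drops below
# its baseline, as one might after a traffic intervention. The series,
# slack, and threshold are invented for illustration.

def detect_drop(series, baseline, slack=1.0, threshold=5.0):
    """Return the index where cumulative evidence of a mean drop
    exceeds `threshold`, or None if no change point is found."""
    cusum = 0.0
    for i, value in enumerate(series):
        # Accumulate how far readings fall below (baseline - slack),
        # resetting to zero whenever the series recovers.
        cusum = max(0.0, cusum + (baseline - slack) - value)
        if cusum > threshold:
            return i
    return None

# Mean around 50 before a (hypothetical) policy, around 40 after it.
series = [50, 51, 49, 50, 41, 40, 39, 41, 40]
change_at = detect_drop(series, baseline=50.0)
```

Because the statistic updates one reading at a time, the same test can run online across thousands of sensors, flagging each the moment its behaviour shifts.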
Damoulas’s team meets with policymakers at the GLA once a week to find ways to use the findings to make policy. The key is being able to understand what causes air pollution to fluctuate across the city. Rather than have a static map of air pollution, the team aim to provide a tool which will allow government to test the impact of specific interventions in minute detail.
“We and the GLA are very interested in testing the effect of the policy, and not just at an aggregate level,” said Damoulas. “Which areas were affected, what was the effect of introducing this specific policy? How does that connect to socio-economic characteristics? Are you breathing more pollution because you are poor?”
The challenges ahead
London isn’t the only city where government is trying to improve its understanding of air quality. In Chicago, authorities have partnered with local universities to set up the “Array of Things”, a network of air quality sensors across the city that provides policymakers and the public with a “fitness tracker” for the city.
Two flagship smart city projects, one run by Alphabet subsidiary Sidewalk Labs on the Toronto waterfront and the other Bill Gates’s Belmont development in Arizona, will install air quality sensors as they build, allowing them to track pollution and test ways of tackling it much as London proposes.
“We’re interested in detecting changes – that’s very interesting for policy evaluation.”
The Clean Air in London project is still in the opening stages of a two-year partnership between the GLA and the Alan Turing Institute. By engaging the public directly, London plans to avoid the expense of installing new sensors and to gain the flexibility to test specific ideas. Its success rests on whether it can build a community of engaged contributors: it took eBird years of trial and error to find the right combination of incentives.
Due to London’s complex governmental structures – services are run by 32 different boroughs, each with their own data sources – gaining access to the supplementary data which will allow Damoulas’s team to run different analyses is difficult.
Even with a sophisticated system in place to monitor air quality in high resolution, reducing pollution will depend on the strength of the policies proposed. Some of these will involve difficult conversations, and likely conflict. Regardless, with the new platform, government will be able to tell quickly and accurately which policies hit the mark and which don’t.
(Picture credit: Flickr/David Holt)