In December 2017 the UK topped a new index of countries as the most prepared to bring Artificial Intelligence to government. According to the AI readiness index, the UK was in a better position than the US, Korea and Japan to apply the new technology to its public services.
UK Prime Minister Theresa May immediately welcomed the news. It is easy to see why. For many, AI is the new frontier for government. With machine learning – programming computers to adapt to changing circumstances, learn from them, and alter their actions accordingly – AI could transform a wide variety of public services. In just one example, hospitals in Queensland, Australia, used it to predict the number of incoming emergency patients, saving as much as $80 million for the state.
One of the reasons for the UK’s AI strength is its public sector innovation body, the Behavioural Insights Team (BIT). Over the past year, BIT has run eight machine learning experiments across varied branches of the UK government: small, practical experiments in which civil servants can trial AI in front-line services, fail, learn and improve, all in the space of a few months. Crucially, in all of these cases, public servants aren’t replaced by artificial intelligence, but work in tandem with it to provide better, more targeted services.
“With an entrepreneurial approach, we can show that you can see real impacts coming out of stuff pretty damn quickly”
“What we wanted to do was work very quickly to try and produce a small number of exemplars,” said Michael Sanders, Chief Scientist and Head of Research and Evaluation at BIT. “When people think about AI and machine learning, or data science, I think there’s too much of a sense that this is a huge investment in data infrastructure where you’ll start to see the first working model in 3–4 years’ time.”
“These models are not perfect: we’ll probably not get them to work the first time or probably even the hundredth time. But we can fail quickly, and, with an entrepreneurial approach, we can show that you can see real impacts coming out of stuff pretty damn quickly.”
An extra pair of eyes
BIT ran machine learning trials in areas ranging from healthcare to education and social services. Among these, one of the most successful was a tool to help social workers predict whether the children they were working with were at risk of future abuse. Social workers in the UK are under considerable pressure, and typically have up to 50 cases to manage at the same time. Given a tool to quickly and accurately assess the riskier aspects of a particular case, social workers could catch the cases they might otherwise wrongly mark as closed, and which would later be referred back to child protection.
“My wife is a child protection social worker, for me it’s probably the most exciting, but also the most challenging exemplar,” said Sanders. “This question about how we make it so that the tool that we’re giving people is one that social workers can and will use, feel confident using, and that we’re not baking in mistakes from the beginning.”
“This tool should not be used to beat them around the head with but as a thing that they can use to try and make their own lives easier and better”
Data scientists at BIT built a tool to analyse the language of thousands of historic closed cases judged to require no further action, and to work out which of them later returned to child protection or worsened. The tool extracted themes from social workers’ case notes, which were then analysed en masse to reveal whether specific language patterns used by social workers helped to predict which cases had been closed prematurely.
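BIT has not published the model itself, but the basic shape of such a tool is a supervised text classifier: historic case notes as the input, and whether the case later returned to child protection as the label. The sketch below illustrates the idea in Python; the file and column names are hypothetical, and the modelling choices are assumptions rather than BIT’s actual pipeline.

```python
# Minimal sketch of a case-note risk model, assuming a labelled history of
# closed cases. The file and column names ("closed_cases.csv", "case_notes",
# "returned_to_protection") are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

cases = pd.read_csv("closed_cases.csv")  # hypothetical extract of historic closed cases

X_train, X_test, y_train, y_test = train_test_split(
    cases["case_notes"], cases["returned_to_protection"],
    test_size=0.2, random_state=0, stratify=cases["returned_to_protection"])

# Turn free-text case notes into word and phrase frequencies, then fit a simple
# linear classifier that scores each closed case for risk of re-referral.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=5, stop_words="english"),
    LogisticRegression(max_iter=1000, class_weight="balanced"))
model.fit(X_train, y_train)

# Risk score = predicted probability that a closed case later returns.
risk_scores = model.predict_proba(X_test)[:, 1]
```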
“This tool should not be used to beat them around the head with, like a stick, but as a thing that they can use to try and make their own lives easier and better, whilst also improving the outcome for young people,” said Sanders. By analysing recurring patterns in the language social workers used, the aim was to reveal the wrongful and accidental assumptions made as they worked through cases.
“One of the great things about machine learning is that it doesn’t allow you to make as many assumptions about the relationships between variables as traditional analysis does,” said Sanders. “Hopefully people respond positively to that.”
The algorithm was able to flag a small set of closed cases, around 6%, as ‘high risk’; that set contained nearly half of the cases that would later return to child protection and escalate. Using the tool, social workers would be able to challenge their own assumptions and reconsider cases they might otherwise miss under pressure.
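That 6% figure is, in effect, a statement about recall at a fixed flag rate: if reviewers can only realistically re-examine a small fraction of closed cases, how many of the eventual returns does that fraction capture? A minimal continuation of the sketch above, reusing the same hypothetical variable names:

```python
import numpy as np

# Flag the 6% of closed cases with the highest predicted risk
# (assumes risk_scores and y_test from the previous sketch, with 0/1 labels).
flag_rate = 0.06
threshold = np.quantile(risk_scores, 1 - flag_rate)
flagged = risk_scores >= threshold

# What share of the cases that actually returned fall inside the flagged set?
recall_at_6pct = y_test[flagged].sum() / y_test.sum()
print(f"Flagged {flagged.mean():.0%} of cases, capturing "
      f"{recall_at_6pct:.0%} of eventual returns")
```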
BIT also experimented with using AI to tailor public services to specific demographics according to their race, gender, or economic status.
“What we want for the next wave of movement towards what we expect government to be, is not just ‘what works’, but ‘what works for whom’,” said Sanders.
Randomised controlled trials (RCTs) are what policy makers typically use to test the impact of a particular policy on different subgroups. In the past, however, these have tested one or two characteristics separately, for instance the specific impact of an intervention on women, or on children; they are less useful for working out the effect of a policy or intervention when many such characteristics are combined.
BIT partnered with King’s College London in 2016 to run an RCT testing the effect of two different text messages on attracting new students to the university’s society fair. One emphasised the social benefit of joining a society, the other the benefit for future job prospects. They discovered that the social messages emphasising belonging were marginally more effective than the professional ones, boosting attendance by 6% and 5% respectively.
“We can make the world a slightly better place with personalisation, but we have to overcome these problems”
The following year, BIT used this data to train an algorithm. Rather than randomly allocating the two different messages to the two sets of students, the algorithm crunched the previous year’s data to see which student was most likely to respond to which particular message. They then tested it by sending half the text messages randomly, while allocating the other half according to the algorithm’s recommendations.
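The underlying idea is straightforward to sketch: fit a model on the previous year’s trial that predicts attendance from student characteristics plus which message was received, then, for each new student, score both messages and send the one with the higher predicted attendance. The example below is a hypothetical illustration; the feature names and model choice are assumptions, not BIT’s actual implementation.

```python
# Hedged sketch: allocate each student's message using a model trained on the
# previous year's RCT. File, column, and feature names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

past = pd.read_csv("freshers_rct_2016.csv")  # hypothetical prior-year trial data
student_features = ["age", "distance_from_campus_km", "is_international"]

# Train on student characteristics plus which message they received
# (1 = social/belonging message, 0 = career/job-prospects message).
X = past[student_features].assign(got_social_message=past["got_social_message"])
model = GradientBoostingClassifier().fit(X, past["attended_fair"])

def choose_messages(students: pd.DataFrame) -> pd.Series:
    """Predict attendance under each message and pick the higher-scoring one."""
    p_social = model.predict_proba(
        students[student_features].assign(got_social_message=1))[:, 1]
    p_career = model.predict_proba(
        students[student_features].assign(got_social_message=0))[:, 1]
    return pd.Series(
        ["social" if s >= c else "career" for s, c in zip(p_social, p_career)],
        index=students.index)
```

Keeping half of the new cohort randomly allocated, as BIT did, preserves a benchmark against which the algorithmic allocation can be judged.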
The trial wasn’t particularly successful: 60.5% of those contacted with the algorithmically allocated message attended the fair, compared with 60.1% of those contacted at random. Nevertheless, should the algorithm be refined, it demonstrates the possibility of assessing the different impacts of an intervention across all the characteristics of a group. It would allow individuals, no matter their identity, to be targeted with the interventions best suited to them, improving outcomes for those who fall through the cracks when policy is made.
“Obviously it needs to be handled really carefully, so you don’t end up in a discriminatory world where people are not receiving public services one way or another,” said Sanders. “We can make the world a slightly better place with personalisation, but we have to overcome these problems. The fact that there are problems with models, and nothing is perfect, is not a good enough reason for us to sit on the sidelines and say that’s too difficult.”
Avoiding bias and escaping the algorithm
There is a growing concern among the public and policy makers themselves that artificial intelligence could perpetuate the biases which exist in society. New York recently announced an algorithm task force, designed to oversee the AI systems the city’s agencies use to automate their processes. The UK has announced its own Centre for Data Ethics and Innovation to ensure high ethical standards when government works with AI. The fear is that human biases hidden in historical data could be locked into the algorithms built on it, and then perpetuated in a much more systematic fashion.
“There are always risks of baking bias into the process,” said Sanders. “We are doing what we can to try and stop that – that’s part of the reason why we’ve not just deployed everything into the field.”
By consulting with practitioners and working with open data sets, BIT has tried to make its work as transparent as possible without jeopardising the purpose of the work itself. But while transparency can help counter bias, too much of it could result in public servants performing to the standards a particular algorithm measures, rather than focusing on the real purpose of their roles.
“I think we have to think about how public and how transparent these things are,” said Sanders. “We think it’s absolutely important to be engaging with social workers, teachers and clinicians on this stuff, but I don’t want to see a world where people start chasing the algorithm, for want of a better phrase.”
Achieving a balance between improving services, revealing rather than exacerbating biases, and preventing new and harmful behaviour is difficult, and requires public servants to think carefully about how AI is applied. The experimental approach, in which safeguards and transparency are built into the process, allows policy makers and the public to assess the value of innovation for themselves. It could be the key to realising the great promise of AI in government.
(Picture Credit: Flickr/Walter)