This piece was written by Rebeca Moreno Jimenez, a data scientist with the Innovation Service of the UN High Commissioner for Refugees.
In the world of marketing, machine learning has been used to process large amounts of information to help improve services for customers.
In the humanitarian sector, AI applications are a new area for exploration. They can allow humanitarians, innovators, and data specialists to compile, process, and visualise huge amounts of data in seconds.
Many humanitarian emergencies are complex, and first responders often have only partial information. To get a full picture of a complex situation, many elements need to be analysed. Sadly, humans can’t compile all the different information in the short timeframe needed to respond. This is precisely where machines can help.
Big Data: challenges and opportunities in the humanitarian context
Depending on the context, humanitarians frequently use proxies to get a full picture of a specific situation: data points that are not directly relevant, but that offer partial insights into issues that would otherwise be completely unknown to them.
Often these insights are found in traditional forms of data: census information, surveys, focus group discussion notes or key informant interviews.
However, additional insights can also be found in other forms of data: radio broadcasts, geospatial data, call data records, wearables and social media — just to mention a few.
The amount of data produced by these non-traditional data sources is huge and usually ‘heavy’. For example, Twitter produces an enormous amount of data in a matter of seconds. It is estimated that approximately 200 billion tweets are produced in a year (around 6,000 tweets per second).
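As a quick sanity check, the per-second figure can be scaled up to a year. The short sketch below assumes a constant rate and a 365-day year (both simplifications, and the function name is only illustrative); it gives roughly 189 billion tweets, which is consistent with the rounded "approximately 200 billion".

```python
# Rough arithmetic check of the figures quoted above.
# Assumptions (not from the article): a constant rate and a 365-day year.
TWEETS_PER_SECOND = 6_000
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def tweets_per_year(rate_per_second: int = TWEETS_PER_SECOND) -> int:
    """Scale a per-second tweet rate up to a full year."""
    return rate_per_second * SECONDS_PER_YEAR


if __name__ == "__main__":
    # Prints the yearly total with thousands separators.
    print(f"{tweets_per_year():,} tweets per year")
```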
Compiling and analysing results to answer specific questions would demand a great deal of time and energy from our UNHCR colleagues, particularly our communication colleagues, on top of an already heavy workload. Some of them have done it manually, compiling meaningful insights by hand.
But to scale up this process and, most importantly, to be able to quantify it with a certain degree of statistical significance, humanitarians can rely on machines to sample, compile, and catalogue data in real time.
Training a machine to detect xenophobia
In 2015, the UNHCR Innovation Service partnered with UN Global Pulse, the United Nations initiative for big data analytics, to find additional insights into a rapidly evolving setting: the Mediterranean refugee situation.
The teams used machine learning to “find”, “read”, “compile”, and “catalogue” tweets, looking for movement intentions or comments on service provision that would incentivise movement. Although some comments were relevant, the sample of tweets found was too small to provide statistically sound evidence.
However, the machine found anomalous comments that spiked during the terrorist incidents in Europe. Every time a new incident happened (Munich, Paris, and Berlin, to name some of the key events), posts with a negative sentiment towards refugees appeared in different parts of the world. Sometimes these posts even associated refugees with the incidents.
The teams then re-trained the machine with a human rights-based bias: to find comments that demonstrated xenophobia. The teams ‘taught’ a machine to ‘learn’ how to read, compile, categorise, anonymise, and aggregate different types of Twitter posts, in different languages and across cities, and to quantify both xenophobic and integration-friendly comments.
We drafted a White Paper to share the process and quantitative results of experimenting with machine learning to understand the scale of this sentiment in the region. It could be used as evidence by humanitarian organisations preparing an advocacy campaign or drafting policy recommendations to counter xenophobia. For UNHCR teams, it could help direct their community-based protection initiatives by showing the main issues that refugees encounter when arriving in a new country.
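To make the categorise, anonymise, and aggregate steps more concrete, here is a minimal sketch in Python. It is not the teams' actual system: the keyword lists, the `Post` fields, and the function names are all hypothetical, and simple keyword matching stands in for the trained machine-learning model described above.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical keyword lists, for illustration only. The project described
# above used a trained model, not keyword matching.
XENOPHOBIC_TERMS = {"go home", "invaders"}
FRIENDLY_TERMS = {"welcome", "solidarity"}


@dataclass
class Post:
    user: str      # account handle; never copied into the aggregated output
    city: str
    language: str
    text: str


def anonymise(text: str) -> str:
    """Replace @handles so the text no longer names specific accounts."""
    return re.sub(r"@\w+", "@user", text)


def categorise(text: str) -> str:
    """Assign a coarse label based on the hypothetical keyword lists."""
    lowered = text.lower()
    if any(term in lowered for term in XENOPHOBIC_TERMS):
        return "xenophobic"
    if any(term in lowered for term in FRIENDLY_TERMS):
        return "integration-friendly"
    return "other"


def aggregate(posts: list[Post]) -> Counter:
    """Count labels per (city, language); only the counts are kept."""
    counts: Counter = Counter()
    for post in posts:
        label = categorise(anonymise(post.text))
        counts[(post.city, post.language, label)] += 1
    return counts


if __name__ == "__main__":
    # Invented example posts, not real data.
    sample = [
        Post("a", "Berlin", "en", "Refugees welcome here @bob"),
        Post("b", "Berlin", "en", "They are invaders"),
        Post("c", "Paris", "fr", "Il fait beau aujourd'hui"),
    ]
    for key, count in sorted(aggregate(sample).items()):
        print(key, count)
```

The design point is that only aggregated counts per city and language leave the pipeline, so individual accounts are not exposed in the results.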
The promise of machine learning: more questions than answers
By using machine learning, both teams obtained a snapshot of evidence on questions related to integration for one region. However, in data science, where data is king, insights always produce more questions.
After analysing some of the results, the teams reflected on the following questions: A) how can we use AI for advocacy purposes in other regions? B) how can we help other agencies and organisations to use these tools in order to understand complex contexts where social media is not prevalent, or there is no electricity/connectivity? Also, when more walls are going up, C) how can we leverage AI to analyse big data and create a counter-narrative for hate speech? And finally, D) how can we translate integration and counter xenophobia in a digital world?
If you have an answer to any of these questions or would like to experiment with us, feel free to reach us. We have some ‘robots’ that could help with some of the tasks. — Rebeca Moreno Jimenez
A longer version of this article was originally published on the UNHCR Innovation Service blog.
(Picture credit: Flickr/UNHCR Photo Unit)