Predictive Peacekeeping: Strengthening Predictive Analysis in UN Peace Operations

The UN is becoming increasingly data-driven. Until recently, data-driven initiatives have mainly been led by individual UN field missions, but with António Guterres, the new Secretary-General, a more centralized approach is being embarked on. With a trend towards the use of data to support the work of UN staff, the UN is likely to soon rely on systematic data analysis to draw patterns from the information that is gathered in and across UN field missions. This paper is based on UN peacekeeping data from the Joint Mission Analysis Centre (JMAC) in Darfur, and draws on interviews conducted in New York, Mali and Sudan. It will explore the practical and ethical implications of systematic data analysis in UN field missions. Systematic data analysis can help the leadership of field missions to decide where to deploy troops to protect civilians, guide conflict prevention efforts and help preempt threats to the mission itself. However, predictive analysis in UN peace operations will only be beneficial if it also leads to early action. Finally, predictive peacekeeping will not only be demanding of resources, it will also include ethical challenges on issues such as data privacy and the risk of reidentification of informants or other potentially vulnerable people.


Introduction
We are living in the information age. Hilbert and López estimate that in 2002, the worldwide digital storage capacity overtook the total analog information capacity for the first time. As of 2007, around 94 per cent of the total amount of information was stored in digital form. Not only has the amount of available data grown, the computing power to analyze these data has also grown exponentially: computing capacity grew by 58 per cent per year between 1986 and 2007 (Hilbert and López 2011). The amount of data collected by the UN follows this general trend. The UN is becoming increasingly data-driven. Until recently, data-driven initiatives were mainly led by individual field missions, but under the new Secretary-General a more centralized approach is being pursued. With a trend towards the use of data to support the work of UN staff, the UN is now in a position to draw patterns from the information that is gathered in and across field missions. This paper is based on fieldwork in New York, Mali and Sudan and explores the practical and ethical opportunities and challenges of systematic data analysis for UN field missions, or what we term 'predictive peacekeeping'. Predictive peacekeeping refers to a range of analytic tools and peacekeeping practices that serve to forecast where and when armed violence will take place, combined with changes in peacekeeping leadership decision making (in particular the deployment of peacekeeping staff) based on those forecasts. This definition draws on conceptual work on predictive policing aimed at crime prevention (see Uchida 2014: 3871).
This article adds to a growing literature on information collection and analysis efforts conducted by UN staff in the domains of early warning and prevention, peacekeeping, peacebuilding and humanitarian action. Numerous studies focus on intelligence efforts within the context of peacekeeping missions (Shetler-Jones 2008; Dorn 2016; Duursma 2017; Karlsrud 2018a: ch. 4). However, how systematic data analysis can affect the early warning practices within UN field missions, and in particular UN peacekeeping, has received little attention. In addition, this article contributes to a growing literature on early warning systems based on machine learning. There is a vast and growing body of work on how machine learning could be used to predict armed violence (Perry 2013; Blair, Blattman and Hartman 2017; Colaresi and Mahmood 2017). Yet, so far, there has been little scholarly reflection on the implications of the UN drawing on machine learning to improve its early warning capacity (for a notable exception, see Karlsrud 2014).
This article is based on actual UN peacekeeping data from the Joint Mission Analysis Centre (JMAC) in the African Union-United Nations Mission in Darfur (UNAMID),¹ and draws on interviews conducted in New York, Mali and Sudan. It examines how UN peacekeeping data can be used for predictive analysis; in our analysis we also assess how data from the Situational Awareness Geospatial Enterprise (SAGE) event database tool can be used for predictive analysis. SAGE is in the process of being rolled out by the UN Secretariat to all peacekeeping and peacebuilding field missions.² Predictive peacekeeping is thus both about the early identification of a threat and early action aimed at mitigating this threat. Systematic data analysis can help the leadership of peacekeeping missions to decide where to deploy troops to protect civilians, it can guide conflict prevention efforts and it can help preempt threats to the mission itself. However, an important caveat in this regard is that these benefits hinge on successfully translating early warning into early action, a seemingly perennial challenge for UN peacekeeping operations (UNGA 2014). Drawing out patterns to enable the prediction of violent incidents will not only be demanding on resources, it will also raise ethical challenges on issues such as data privacy and the risk of reidentification of informants or other potentially vulnerable people mentioned in the data.
This article is organized in the following manner. The first part describes the recent turn to data-driven practices within the UN. Next, it sketches the potential role of systematic data analysis, with a specific focus on machine learning, with regard to early warning and field missions. The subsequent parts address the practical and ethical implications of the use of systematic data analysis by UN information analysts to predict armed violence. The final section concludes and emphasizes the great potential systematic data analysis holds for future peacekeeping efforts.

Necessity Is the Mother of Invention: Current UN Data-driven Practices
UN peacekeeping has evolved from being considered an antiquated and outdated organization by Western member states to increasingly driving the adoption of new technology in the UN (Karlsrud 2018a).³ This trend arguably started when a consensus emerged in the early 2000s that the UN should be allowed to produce more efficient field intelligence for its peacekeeping missions. The UN Department of Peacekeeping Operations (DPKO) decided in 2006 that all peacekeeping missions should have a Joint Mission Analysis Centre (JMAC) (Shetler-Jones 2008). When he assumed office in 2011, the then Under-Secretary-General for UN peacekeeping, Hervé Ladsous, embarked on a program of bringing 'the UN into the 21st century' (UN News Centre 2017; see also UN 2017a), adding controversial elements such as surveillance and an intelligence policy to the tools of UN peacekeeping operations. These are key components of the Joint Situational Awareness Programme of UN DPKO.
The report of the High-level Independent Panel on Peace Operations (HIPPO) also stressed the role of technology and emphasized the need to strengthen the analytical capabilities of peace operations (UN 2015a). In the Secretary-General's follow-up report to the report of the HIPPO, titled The Future of Peace Operations, the Secretary-General tasked the UN Secretariat with 'developing parameters for an information and intelligence framework that can support field missions in operating effectively and safely' (UN 2015b: 94). In May 2017, the first UN Peacekeeping Intelligence Policy was established, explaining the process by which peacekeeping intelligence should be done (UN DPKO and Department of Field Support (DFS) 2018: 19). On 28 March 2018, the Secretary-General launched the Action for Peacekeeping (A4P) initiative at a Security Council high-level debate, stressing among other things the need for effective intelligence to identify threats to peacekeepers.
UN peacekeeping operations have developed several tools to strengthen the quality, coordination and relevance of data, information and analysis, and ultimately the situational awareness and ability to prevent, mitigate or respond to violent incidents and protect civilians. For instance, in 2013, the All Sources Information Fusion Unit (ASIFU) within the United Nations Multidimensional Integrated Stabilization Mission in Mali (MINUSMA) was created. The creation of the ASIFU was a potentially innovative step within the context of information collection efforts in peacekeeping operations. The ASIFU provided actionable and integrated intelligence products based on a comprehensive approach, which relied, among other things, on the efforts of military units that are specifically tasked to gather intelligence. The intelligence analyses conducted by the ASIFU were aimed at helping the force commander of MINUSMA to accomplish the mission's goals and mitigate threats to the mission (Karlsrud and Smith 2015; Duursma 2018a). However, in late 2017, the ASIFU was merged with the military information (U2) section of the military component of MINUSMA. MINUSMA has also developed a Ushahidi crisis-mapping platform to geotag reports of security incidents and other information and present these in real-time maps.⁴ A similar reporting tool has been developed by the UN mission in the Democratic Republic of Congo (MONUSCO/DRC), and the UN mission in the Central African Republic (MINUSCA/CAR) has developed a flashpoint matrix to identify risks of physical violence against civilians and facilitate a multidimensional response (UN 2017b). These are just a few examples of innovation in individual field missions. Nevertheless, many of these processes are still ad hoc, based on local innovation, and have significant potential for improvement. For example, these tools have little predictive capacity.
At the structural level, the DPKO has rolled out the SAGE system to track and visualize incidents.⁵ The UN, using in-house developers based at the UN support base in Valencia in Spain, developed SAGE, an incident and event database tool. The tool is a web-based database system that allows UN military, police and civilians in UN peace operations (both UN peacekeeping operations and special political missions) to log incidents, events and activities. SAGE is an integral and core part of the Mission Common Operational Picture (MCOP), the latter being developed during 2018. SAGE not only includes incidents pertaining to armed violence, but also information on incidents like troop movements, increased tensions, hijackings, abductions, protests and many more potentially relevant incidents. Instead of just reporting free text, the information in SAGE is stored as structured data. This means that the event is categorized (type of event, number of victims, ethnicity, number and affiliation of perpetrators, geographical coordinates and so on). Duplicates are deconflicted either at the regional or central level, enabling corroboration. Over time, the gathering of structured data enables mission leadership to identify trends and indicators for early warning. Different sections (human rights, civil affairs, justice, gender, etc.) can also insert comments that are only available to their specific section, to enable limited circulation of sensitive data. In short, while peacekeeping information-gathering efforts have been set up in an ad hoc manner to date, efforts are currently under way to set up more standard structures for information gathering within peace missions.
Why is this move towards collecting peacekeeping information so significant? One of the most significant shortfalls of UN field missions has traditionally been a lack of adequate field information. When reflecting upon the UN experience in Rwanda during the 1994 genocide, Lieutenant-General Roméo Dallaire, the force commander for the United Nations Assistance Mission for Rwanda, noted: "I had no means of intelligence on Rwanda. Not one country was willing to provide the UN or even me personally with accurate and up-to-date information. […] We always seemed to be reacting to, rather than anticipating, what was going to happen" (Dallaire 2008: 90 and 194). Other missions have encountered similar struggles (Duursma 2017). Hence, it is clear that the UN would benefit from becoming more data-driven, as this would allow UN staff to be more forward-looking and anticipate rather than react to events on the ground. Jean-Marie Guéhenno, the Under-Secretary-General for United Nations Peacekeeping Operations between October 2000 and June 2008, went so far as to describe peacekeeping as "a never-ending exercise in risk management and decisionmaking in an environment of uncertainty" (Guéhenno 2015: xv).
However, to become truly data-driven, UN field missions still have a long way to go. As an example, UN peacekeeping operations continue to be guided more by annual budgeting and reporting processes than by data collection and analysis. To promote the use of data, António Guterres, the new UN Secretary-General, has set up an executive committee and established the post of Assistant-Secretary-General for Strategic Coordination to enable the UN Secretariat to be more coordinated and data-driven and to link up with external research and data hubs.⁶ What is more, although the UN still suffers from a dearth of data in conflict areas, digitization and digitalization,⁷ data convergence, and open data initiatives are increasing the range of available data. The next section explains how truly data-driven UN peace operations could use systematic data analysis to predict armed violence.

Early Warning 2.0: The Use of Systematic Data Analysis Tools within Peace Operations
Several types of predictive assessments are already conducted by information analysts in peacekeeping missions in order to assess the likelihood of key challenges and opportunities for the implementation of the mandate of peacekeeping missions. First of all, the JMAC analysts draft scenario-based papers in which they develop a range of likely/unlikely and best/worst case scenarios. JMAC analysts also engage in sketching trend assessments. On the basis of categorizing and arranging a set of historical incidents, trend assessments aim to sketch the wider political, social, economic and security implications of trends in relation to the mission mandate. Finally, JMAC analysts issue warning notes. This type of note is distinct from a trend analysis in that it concentrates on a single current or emerging threat that requires a timely and specific response. An early warning note always includes the who, what, how, where and when, as well as the probability of the threat materializing (UN DPKO and DFS 2018: 136-139). While scenario-based analyses, trend analyses and early warning notes all rely on data collected in peacekeeping missions in some way, information analysts in peace missions have not yet engaged in the systematic analysis of peacekeeping data through statistical modelling, let alone trying to predict events using machine learning techniques.
In this paper, machine learning is defined as 'the automated detection of meaningful patterns in data' (Shalev-Shwartz and Ben-David 2014: vii), in other words, learning and making predictions from data (Kohavi and Provost 1998). These techniques take the shape of algorithms that combine a number of different factors and data streams. The difference between machine learning and statistics is that machine learning does not rely on rule-based programming, but rather detects meaningful patterns in data inductively (by examples). This means that the predictions that emerge from machine learning can be more specific, but also that many different cases can be assessed without writing specific code for each problem. Housing prices are a good example: they vary according to size and location, but also design, age, access to sunlight, neighborhood and so on. By feeding in a lot of cases where these factors are categorized, we can gradually improve our algorithms to predict housing prices more accurately in an automated fashion. Machine learning is often divided into supervised and unsupervised learning. Supervised learning is when an algorithm is taught the relationship between predefined categories using training data, and gradually improves its predictions. Unsupervised learning techniques do not require predefined categories and detect patterns in data themselves.
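The housing price example can be made concrete with a minimal sketch of supervised learning. It assumes a single factor (size) and invented figures, and it uses an ordinary least-squares line as the simplest possible learner; the function names are hypothetical, and a realistic model would combine many more factors.

```python
# Toy supervised learning: learn price = intercept + slope * size from examples.
# All figures are invented for illustration only.

def fit_line(xs, ys):
    """Return (intercept, slope) minimizing squared error for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x):
    """Apply the learned relationship to a new, unseen case."""
    intercept, slope = model
    return intercept + slope * x

# Training data (hypothetical): size in square metres, price in thousands.
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

model = fit_line(sizes, prices)
estimate = predict(model, 100)  # estimated price for a 100 square metre house
```

The point of the sketch is that no rule such as "price is three times size" is written by hand; the relationship is recovered from the training examples and then applied to a case the algorithm has not seen.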
The use of machine learning in UN peacekeeping would mostly be a case of supervised learning, where algorithms are developed, tested and tweaked to constantly improve their predictive capacity. The categories in the SAGE database described earlier would be equivalent to the categories in the housing price example given above. Using supervised machine learning techniques, peacekeeping analysts will be able to build their hypotheses into the supervised learning models with which they structure the data. It will be important, in this regard, that there is a feedback loop between the peacekeeping analysts and the data scientists developing the machine learning algorithm. Information analysts can provide an informed theory about how a particular event is having an impact on a particular outcome, while the data scientists could reveal patterns that the analysts were not aware of.
However, it will also be possible to apply unsupervised learning, letting the computer identify patterns that analysts may not have contemplated. It could even be within the realm of the possible to use natural language processing algorithms to categorize the free text descriptions that accompany each event entry in the SAGE database, and to explore whether any patterns can be found based on these categories as well.
Using machine learning makes it possible to identify which variables (and, crucially, which combinations of variables) included in SAGE are strong predictors of where possible attacks may take place or where intercommunal violence may be about to happen. A major advantage of using machine learning to identify combinations of risk factors is that it allows the analyst to observe how various events and developments combine to affect outcomes. For example, a tip-off of an impending attack might not be a significant predictor of armed violence, but in combination with reports on actual troop movements, it might be a highly significant predictor. Machine learning thus makes it possible to grasp the interdependence of all types of incidents reported in SAGE. In short, a necessary condition for predicting armed violence on the basis of machine learning is to have a set of variables that together explain the onset of armed violence. SAGE currently fulfills this need, as it categorizes event data.
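The tip-off and troop movement example can be illustrated with a small sketch. The records below are entirely hypothetical and do not come from SAGE or JMAC data; the sketch simply tabulates how often violence follows each combination of the two indicators, which is the kind of interaction a learning algorithm would pick up automatically.

```python
from collections import defaultdict

# Hypothetical area-week records: (tip_off, troop_movement, violence_followed).
records = [
    (1, 0, 0), (1, 0, 0), (1, 0, 1), (1, 0, 0),
    (0, 1, 0), (0, 1, 1), (0, 1, 0), (0, 1, 0),
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0),
    (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]

def violence_rate_by_combination(rows):
    """Share of records followed by violence, per (tip_off, troop_movement) pair."""
    counts = defaultdict(lambda: [0, 0])  # combination -> [violent, total]
    for tip_off, troop_movement, violent in rows:
        counts[(tip_off, troop_movement)][0] += violent
        counts[(tip_off, troop_movement)][1] += 1
    return {combo: violent / total for combo, (violent, total) in counts.items()}

rates = violence_rate_by_combination(records)
```

In this invented sample, each indicator on its own is followed by violence in a quarter of cases, while the two together are followed by violence in three quarters of cases. Neither variable looks decisive in isolation, but their combination does.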
Another necessary condition for using the data in SAGE for predictive analyses is to shift from the incident as the unit of analysis to a particular geographical area as the unit of analysis (for example, a municipality, a settlement or even a grid cell). This would make it possible to take negative cases (areas where violence is not taking place) into account, which in turn makes it possible to determine what factors drive the onset and termination of armed violence in areas (for example, a peacekeeping deployment), as well as which factors drive the spread of armed violence from one area to another area (Beardsley and Gleditsch 2015; Duursma and Read 2017). Having an understanding of these factors would greatly enhance predictive capacity. Finally, taking a defined geographical area as the unit of analysis makes it possible to control for the non-random assignment of peacekeeping staff and armed violence, since it would then be possible to take into account contextual data such as topography (mountainous areas vs. flatlands), climate change and drought, urbanization, etc.
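The shift from incidents to areas as the unit of analysis can be sketched as a simple aggregation step. The cell identifiers, weeks and event types below are hypothetical, as is the function name; the sketch only shows how an incident log becomes a panel with one row per area and period, including the negative cases where nothing was reported.

```python
from itertools import product

# Hypothetical incident log: (grid_cell, week, event_type).
incidents = [
    ("A1", 1, "troop_movement"),
    ("A1", 2, "armed_clash"),
    ("B2", 2, "livestock_theft"),
    ("A1", 2, "armed_clash"),
]
cells = ["A1", "B2", "C3"]
weeks = [1, 2]

def build_panel(incident_log, all_cells, all_weeks):
    """Count events per (cell, week), keeping cell-weeks with no incidents."""
    panel = {(cell, week): {} for cell, week in product(all_cells, all_weeks)}
    for cell, week, event_type in incident_log:
        row = panel[(cell, week)]
        row[event_type] = row.get(event_type, 0) + 1
    return panel

panel = build_panel(incidents, cells, weeks)
```

Cell C3 never appears in the incident log, yet it contributes two rows to the panel. It is these empty rows that allow a model to compare areas where violence occurred with areas where it did not.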
With the necessary conditions of categorizing event data and employing a geographical area as the unit of analysis fulfilled, a key challenge for predicting the onset of armed conflict will be to deal with what has been referred to as the rare event problem. As explained by Cederman and Weidmann, '[s]tandard, off-the-shelf machine learning models are typically applied to problems in which the different outcomes are relatively balanced. This is not the case for predictions of violence and peace, in which the units examined are peaceful most of the time' (2017: 475). The rare event problem is of course particularly problematic when predicting armed conflict on the country level or for predictions in countries that experience low levels of armed violence. Yet, the type of countries in which most of the contemporary peacekeeping missions operate experience medium-to-high levels of violence, though it should be noted that even in these high-intensity conflicts armed violence often clusters in space, meaning that most subnational districts do not experience conflict most of the time (Buhaug and Gates 2002; Buhaug and Gleditsch 2008). In addition, recent advances have been made in dealing with the rare event problem. Using a resampling technique, Muchlinski et al. (2016) managed to predict nine out of 20 civil war onsets correctly, while most conventional regression models failed to predict any of these civil wars. Applying these types of resampling techniques to explain patterns of armed violence on a subnational level would probably lead to higher levels of successful predictions. While it is not yet clear what level of prediction accuracy is necessary for a UN early warning system to serve as a solid basis for early peacekeeping action, several senior UN information analysts have indicated that accuracy of at least 80 per cent would be desirable from their point of view.⁸
But perhaps the most fundamental challenge to predicting armed conflict in space and time will be to obtain high-quality data. Conflict processes are incredibly complex because they 'typically encompass an unwieldy set of actors interacting in surprising and, by definition, rule-breaking ways' (Cederman and Weidmann 2017: 475). It is precisely this complex nature of armed conflict that makes it important not to be overly optimistic about the potential of predicting the onset of civil wars on a country level. The first generation of conflict prevention models within academia used to predict the onset of civil wars drew on 'sluggish' variables that changed very little from one year to the next. Examples of these types of variables are the GDP of a country, a country's level of democracy and population size. Since these variables do not change a lot, it is hard to predict change (from peace to war) in a given country with these variables. In contrast to country-level data like the GDP of a country, local-level data usually pertains to tensions and other theoretically relevant determinants of local armed violence. Hence, modelling the complex and rule-breaking behavior of armed actors on the local level, using local data, will probably work better than predicting the onset of civil wars on the country level. However, it should be noted that recent progress has been made with regard to generating more theoretically meaningful data on the country level as well. Chadefaux (2014) recently showed that using news reports makes it possible to capture political tension in a country at a particular moment in time, which allows for rather effective conflict prevention with a one-month time frame. The data within the UN system is arguably even more useful when it comes to predicting armed conflict than media reports because peacekeeping data covers many events that are not covered by the news media (Duursma 2017). This gives the UN a comparative advantage in its ability to draw on excellent data. Duursma recently showed that the JMAC conflict data on Darfur is much more comprehensive and precise than the data collected by the Armed Conflict Location & Event Data Project (ACLED). Crucially, JMAC data typically also include observations on the troop movements of armed actors or tensions identified by informants, which significantly increases the potential for early warning (Duursma 2017). The SAGE data described in this article is even more precise, particularly where the system has been rolled out and implemented across the different components and sections of the mission, such as MINUSCA.
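To return briefly to the rare event problem discussed above, the general idea behind resampling can be sketched as follows. This is plain random oversampling of the rare class on invented data, chosen here for its simplicity; it is not a reproduction of the procedure used by Muchlinski et al. (2016), and the function name is hypothetical.

```python
import random

def oversample_minority(rows, label_index=-1, seed=0):
    """Duplicate randomly chosen rare-class rows until both classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positives = [row for row in rows if row[label_index] == 1]
    negatives = [row for row in rows if row[label_index] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    if not minority:
        raise ValueError("need at least one example of each class")
    shortfall = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(shortfall)]
    return majority + minority + extra

# Hypothetical district-months: (troop_movement_reports, violence_onset).
rows = [(0, 0)] * 18 + [(3, 1), (2, 1)]
balanced = oversample_minority(rows)
```

Resampling of this kind would normally be applied to the training data only, so that a model is still evaluated against the real, imbalanced distribution of violent and peaceful area-periods.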
Indeed, the use of peacekeeping data will make it possible to leverage new kinds of predictors that previously were not used when forecasting conflict on the subnational level. Cederman and Weidmann (2017: 476) warn that machine learning based on big data that is not 'theoretically informed' will probably not significantly improve conflict-prediction models:
Ultimately, the hope that big data will somehow yield valid forecasts through theory-free 'brute force' is misplaced in the area of political violence. Automated data extraction algorithms, such as Web scraping and signal detection based on social media, may be able to pick up heightened political tension, but this does not mean that these algorithms are able to forecast low-probability conflict events with high temporal and spatial accuracy.
Peacekeeping data, by contrast, typically includes observations that, from a theoretical perspective, should be strong predictors of armed violence on the subnational level. Rather than taking a huge number of observations (e.g., Twitter posts on a specific topic) to explain a specific event of interest (e.g., armed fighting between government forces and rebels), the use of peacekeeping data makes it possible to use specific events (e.g., troop movements) to predict a specific event (e.g., armed fighting between government forces and rebels).
Consider the following five examples of event types from the JMAC dataset on Darfur that go well beyond observations that are typically used to predict armed violence on the subnational level, like armed clashes, violence against civilians and protests.⁹ First of all, the JMAC data on Darfur includes observations on troop movements, which can signal that an armed actor is mobilizing to attack another armed actor. The following is an example of an observation of troop movements:
On 12 May 08 unconfirmed information received states that six (6 […]
Second, the JMAC dataset on Darfur also includes observations of early warnings issued by local informants about likely impending violence in specific areas. Take, for instance, the following entry in the JMAC dataset:
On 20 May 08, SLA/MM stated that according to their information GoS affiliated armed groups are mobilizing with the purpose to attack several SLA/MM controlled villages, including Muhajeriya, Labado and Marla. SLA/MM added that the attacks are believed to take place today (21 May 08). To date at 11:00 hrs, the security situation in Muhajeriya was reportedly calm. SLA/MM forces are said to be mobilized and prepared to defend their positions. (JMAC observation 479)
Duursma (2017) has shown that these types of early warning signals given by local populations can be used to predict future armed clashes. The reason why local information is a good source of early warning is that local communities often know best what is going on in their respective area. Armed groups do not operate in a social vacuum, but rather have many connections to locals (Kalyvas 2006). These social ties help locals understand when and where future armed violence is going to take place. That early warnings given by local populations are probably a strong predictor of armed violence is in line with a point made in the HIPPO: 'The best information [for peacekeepers] often comes from communities themselves. To use that information, missions must build relationships of trust with local people, leading to more effective delivery of protection of civilians mandates and better protection for peacekeepers' (UN 2015a: para 98). Yet, in order for the UN to benefit from local knowledge to the fullest extent, local early warnings need to be systematically collected and analyzed.
The JMAC data also includes observations that should, in theory, be strong predictors of specific types of armed violence in a given area. For instance, a third event type included in the JMAC dataset on Darfur is rebel splits, which can be expected to be a good predictor of clashes between different rebel groups. The following entry describes an example of a rebel split:
The secretary general of the SLA/MM has reportedly dismissed the deputy secretary general for political affairs and chairperson for North Darfur. These dismissals have not been supported by the members of SLA/MM in South Darfur who accuse the secretary general of the movement of pursuing an NCP agenda. SLA/MM group is due to attend a workshop on the DPA in El Fasher soon. While SLA/MM is militarily engaged against its partner in the Darfur Peace Agreement, it shows worrying signs of internal dissentions that may lead to a break up of the movement. Some elements may decide to join other Zaghawa dominated groups, who were recently discussing a possible unification. (JMAC observation 1095)
The JMAC data also includes observations on theft of livestock, which could be a strong predictor of communal violence. The following entry is an example of this fourth event type: 'On 06 and 7 March 2008 some people suspected to be Janjaweeds from Sharia and Hazajandid villages that came to the town (DAREL SALAM) and stole 29 sheep and 9 cows' (JMAC observation 150).
Fifth, and lastly, the JMAC dataset on Darfur also includes observations on peace events like the initiation of peace talks or even the conclusion of a local ceasefire. Previous research has found that armed violence in a given area is a strong predictor of future armed violence in the area (Costalli 2013; Duursma and Read 2017), but this might not be the case if a local peace deal has been concluded. An example of such a peace event is described in the following entry in the JMAC dataset on Darfur:
[…]
Drawing on information on peace events might help to predict the termination of armed violence in a given area, as locally concluded ceasefires indicate an increased chance of a (temporary) lull in armed fighting.
The five types of events discussed above are by no means an exhaustive list. Other examples of observations that can be found in peacekeeping data are political rallies, arrests, criminal activities, activities of humanitarian actors, peacekeeping patrols and obstruction of peacekeeping patrols. A major advantage of machine learning is that it is possible to tweak the algorithm, using test data, to determine which variables and/or which combinations of variables are strong predictors of armed violence. While it remains to be examined to what extent predicting armed conflict is possible using UN peacekeeping data, it is reasonable to surmise that these new types of predictors can be leveraged to significantly improve conflict-prediction efforts on the subnational level.
It should be noted that while the information-collection effort by the JMAC units is already impressive, there are many sources of information in UN operations that are still not being collected and analyzed in a structured manner. Abilova and Novosseloff (2016: 22) note how JMAC analysts 'could coordinate not only military intelligence, police intelligence, humanitarian information, and political information but also information from social media monitoring, sanctions committees, groups of experts, and military observers who are mobile throughout the areas of operation.' In addition, as a result of the revolution in information technology, peace missions can now reach out to local civilians who use smartphones. JMAC staff can collect security information through crowdseeding, providing locals with a virtual place to send their observations, alerts and insights. In addition, the conflict parties themselves can be requested to participate in cooperative monitoring (Dorn 2016: 1).
The next section reflects on the potential overall impact of predictive analyses on the ability of the mission to implement its mandate.

Practical Implications: Translating Early Warning into Early Action
It is noted in the JMAC Field Handbook that early warning assessments within UN peacekeeping missions are aimed at predicting 'specific events to which the mission will have to react, providing a clear snapshot of a situation at a given time to anticipate escalatory development of a situation that potentially requires short-term preventive actions and long-term preventive measures' (UN DPKO and DFS 2018: 101). Using SAGE data for systematic data analysis, especially machine learning, can offer a great step forward in the predictive capability of the UN, and hopefully be translated into preventive action on the ground (see also Karlsrud 2014; Duursma 2017). The predictive analyses conducted to support UN peacekeeping missions could take the form of at least two specific outputs. A first output could be a risk map in which a color coding of administrative districts indicates the probability of events of interest like armed clashes between the main warring parties, communal violence or violence against civilians. A second output could be a function within SAGE that alerts all the relevant stakeholders within the UN when an event of interest (e.g., communal violence) may be about to happen in a given area (with a statistical plausibility level attached). The following sections sketch how these early warning tools could benefit peacekeeping efforts.
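The two outputs described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of SAGE: the function names, the district labels, the forecast probabilities and the cut-offs (0.3 and 0.6) are all hypothetical.

```python
def risk_color(probability):
    """Map a predicted probability to a color band (cut-offs are illustrative)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= 0.6:
        return "red"
    if probability >= 0.3:
        return "amber"
    return "green"


def build_alerts(forecasts, threshold=0.6):
    """Return alert messages for district/event pairs at or above the threshold."""
    alerts = []
    for (district, event_type), p in sorted(forecasts.items()):
        if p >= threshold:
            alerts.append(f"{district}: {event_type} (estimated probability {p:.0%})")
    return alerts


# Hypothetical forecasts for the coming week, keyed by (district, event type)
forecasts = {
    ("District A", "communal violence"): 0.72,
    ("District B", "armed clash"): 0.35,
    ("District C", "violence against civilians"): 0.08,
}

# First output: a color per district and event type, ready to be drawn on a map
risk_map = {key: risk_color(p) for key, p in forecasts.items()}

# Second output: alerts that carry the estimated probability with them
for message in build_alerts(forecasts):
    print(message)
```

Attaching the estimated probability to each alert, rather than sending a bare warning, leaves the judgment about whether and how to act with the recipient.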

Identifying risks to the mission
Attacks on peacekeepers are unfortunately increasingly common. Salverda (2013) shows that 13 of the 24 UN peace missions deployed in civil wars between 1989 and 2003 experienced attacks by a rebel group. Since 2012, there has been a significant increase in fatalities in UN peacekeeping operations; however, this has mostly been driven by fatalities in MINUSMA (Henke 2016; Karlsrud 2018b). In addition, peacekeepers are not only attacked, they also often face more subtle types of resistance like intimidation or obstruction (Duursma 2018b). The risks peacekeepers and other UN personnel face necessitate intelligence that allows for early action to mitigate potential threats. This is also increasingly recognized within the UN. Intelligence was a taboo subject within the UN for a long time, but following the suicide bombing of a UN compound in Baghdad on 19 August 2003, which killed at least 22 people, including the United Nations' Special Representative in Iraq, the use of intelligence became more acceptable (Dorn 2009; Norheim-Martinsen and Ravndal 2011). Martti Ahtisaari concluded in a report that assessed the circumstances surrounding this bombing that: 'The UN security system failed adequately to analyze and utilize information made available to the system on threats against UN staff and premises. The security awareness within the country team did not match the hostile environment. The observance and implementation of security regulations and procedures were sloppy and non-compliance with security rules commonplace' (UN News 2003).
A telling example of how data analysis could identify threats to missions is that the ASIFU in Mali picked up that the insurgents operating in Mali were starting to target MINUSMA's air assets. After the first few attempts to destroy air assets, ASIFU analysts recognized that air assets were targeted as part of a broader strategy. Several subsequent incidents confirmed this pattern. According to the ASIFU analysts, these attacks indicated that the armed groups opposing MINUSMA realized that MINUSMA could do very little without its air assets in a territory of operations as big as northern Mali. On the basis of this insight, MINUSMA could step up its protective measures of its air assets. 10 This example illustrates that humans can of course also detect meaningful patterns. Yet, in theory, if data on the targeting of peacekeepers is rigorously recorded, then the potential to detect patterns in the threats peacekeepers face through machine learning is significant.
Moreover, there might be variables closely related to attacks on peacekeepers that can serve as early warning indicators. For instance, Fjelde et al. (2016) find that, on the country level, peacekeepers are very likely to be attacked by rebel groups when the balance of power turns against these rebels in their struggle against governments. If information analysts can uncover what explains violence against peacekeepers on the local level through machine learning, then peacekeepers can be alerted when there is a higher risk of being attacked.

Deploying troops
Another area in which early warning through systematic data analysis can be of great added value is the deployment of peacekeepers. Predicting where and when violence is most likely to take place allows the leadership of peacekeeping missions to deploy peacekeepers where they are most needed (Duursma and Read 2017). Peacekeeping missions are often criticized for their limited presence beyond headquarters and peacekeeping bases. Yet, the limited ability of peacekeepers to patrol in all sites of armed conflict is not that surprising given the vast regions in which armed fighting takes place. For example, almost half of all violent incidents in Darfur occur more than 100 kilometers from the nearest peacekeeping camp site, a distance that falls beyond the range of most peacekeeping patrols (UN 2014). 11 While it would be impossible to establish many more peacekeeping team sites throughout Darfur (which has an area roughly the size of France), it would be possible to project a presence at greater ranges from the camps in those areas identified as most at risk. Using data analysis tools that can predict conflict events in time and space could thus potentially help the leadership of peacekeeping missions in decisions about where to deploy their troops and to which areas to send patrols.
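The kind of distance calculation behind a statement such as 'more than 100 kilometers from the nearest camp' can be sketched as follows. The coordinates below are invented placeholders, not actual UNAMID team sites or recorded incidents, and the function names are hypothetical; the sketch uses the standard haversine great-circle formula with a mean Earth radius of 6,371 km.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def share_beyond_range(incidents, camps, max_km=100.0):
    """Share of incidents whose nearest camp is farther away than max_km."""
    far = sum(
        1 for incident in incidents
        if min(haversine_km(incident, camp) for camp in camps) > max_km
    )
    return far / len(incidents)


# Placeholder coordinates (latitude, longitude), invented for illustration
CAMPS = [(13.0, 25.0), (12.0, 24.0)]
INCIDENTS = [(13.1, 25.1), (14.5, 25.0), (12.0, 26.5), (12.2, 24.1)]

print(f"Share beyond patrol range: {share_beyond_range(INCIDENTS, CAMPS):.0%}")
```

Applied to forecast locations rather than past incidents, the same calculation would indicate which high-risk areas lie outside the reach of routine patrols.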
Equally important to the question of where violence is likely to take place is the question of when violence is going to take place. Armed actors are strategic and can adapt to the presence of peacekeepers. For instance, the International Crisis Group reported how the presence of soldiers of the African Mission in Sudan led to a reduction in violence during the daytime, but resulted in a situation in which most violence in Darfur took place under the cover of darkness, particularly in the pre-dawn hours (International Crisis Group 2005: 7). A predictive analysis that indicates when attacks are most likely to take place would thus be very useful for the leadership of peacekeeping missions, as it can help them decide at what times patrols should be sent out.

Conflict prevention efforts
Early warning can also be acted upon through a focused conflict prevention effort. Peace operations typically include a Civil Affairs component. As former Under-Secretary-General of DPKO Hervé Ladsous noted: 'Civil Affairs Officers play a key role in peacekeeping operations, and are an essential part of our 'peacekeeping toolkit', as we work with local communities and authorities to bring stability and help them build the foundations for lasting peace' (UN DPKO and DFS 2012: 4). Indeed, mediating peace talks between armed actors that operate in remote places far from any peacekeeping presence might be the only way to curb fighting in remote areas that are shown to be at increased risk through the analysis of peacekeeping data. Civil affairs officers would benefit greatly from an early warning tool that indicates where armed violence is most likely to erupt, as this helps them to anticipate rather than react to armed violence. This would then be very much a two-way street of information. Considering that Civil Affairs are concerned with building relationships on the ground, they will have access to early warnings issued by locals. In fact, 'conflict analysis, early warning, information-gathering, assessment of needs' is a crucial component of the work of civil affairs officers (UN DPKO and DFS 2012: 25). If civil affairs officers pass on these early warnings to JMAC analysts, the JMAC can use machine learning techniques to estimate the probability and potential impact of these local early warnings. This information can, in turn, be used by the civil affairs officers to focus their conflict prevention efforts.

Big Brain, but Little Hands? Ethical Considerations
Referring to the difficulty of translating early warning into early action, one ASIFU analyst described MINUSMA as 'having a big brain, but very little hands.' 12 Similarly, Edward C. Luck, the Special Adviser to Secretary-General Ban Ki-moon, has pointed out that early warning is not an end in itself: 'Early warning without early and effective action would only serve to reinforce stereotypes of UN fecklessness, of its penchant for words over deeds' (Luck 2010). In 2014, an internal UN Office of Internal Oversight report found that, in 507 attacks between 2010 and 2013, peacekeepers rarely used force to protect civilians under attack (UNGA 2014; see also Müller and Bashar 2017). The continued inability to turn early warning into early action is a constant frustration of practitioners and policy makers, leaving civilian populations in harm's way. Hence, while the use of data analysis tools to predict armed violence could greatly improve early action efforts, it is important to reflect on the practical limitations that can hinder UN staff with regard to translating this greater early warning capacity into early action. UN Secretary-General António Guterres has continued the proactive line vis-à-vis troop-contributing countries that Ban Ki-moon initiated at the end of his second term, when he fired the Kenyan Lt. Gen. Johnson Mogoa Kimani Ondieki after an inquiry into the attacks in July 2016 on UN staff in Juba, South Sudan (Curtis 2016). In July 2017, the UN repatriated 600 troops from the Republic of Congo after strong allegations of sexual exploitation and abuse (Whalan 2017).
It is not only in terms of operational hubris that machine learning may be a double-edged sword. The data lifecycle contains numerous risks in terms of putting informants in danger. The collection and storage of sensitive data therefore necessitates increasingly strong rules and routines for the storage and management of data and information. How and for how long will the information be stored, who will have access to it and what types of security measures will be taken at all levels to ensure the integrity and safety of the data?
Privacy concerns will be central: civilians who are already at risk can face new threats if their personal information is disclosed or reidentified. A telling example in this regard is the confidential relationship UN information analysts have with local informants. Local informants often play a crucial role in the day-to-day work of information analysts, as they can provide information that is very hard to get from other sources (Duursma 2018a). However, this valuable information comes at a price, as conflict parties may target individuals who are suspected to have passed on information to the UN in order to dissuade other potential informants from also sharing information (see, for example, Kalyvas 2006). The tricky issue is thus not simply about confidentiality, but also about the risks of maintaining trust and anonymous relationships. Indeed, a UN information analyst working for MINUSMA in Mali recounts how al-Qaeda in the Islamic Maghreb has circulated death lists with alleged local informants. 13 This suggests that, in certain contexts, providing information to the UN can get informants killed. If, for some reason, UN data on local informants is leaked, the consequences can be catastrophic. The UN has already become the target of offensive cyber attacks. Successful cyber attacks could retrieve data and potentially expose the names of many UN informants.
Not only informants, but also 'ordinary' citizens may be at risk because of UN data collection efforts. For instance, data gathered on a general level may expose ethnic or other groups to retributive action if it falls into the wrong hands. These are serious risks that the UN is already faced with (Scott-Railton 2013; Scott-Railton and Marquis-Boire 2013; Nyst 2014; for initial responses see, for example, Gilman and Baker 2014). An example is the collection of biometric data from all Syrian refugees (a practice that the UN High Commissioner for Refugees began in 2010), used to create a 'cross-border identity' for refugees in Syria and neighboring countries (Jacobsen 2015).
In addition, there have to be checks and balances built into the use of predictive analysis. Algorithms could be biased, e.g., directing conflict prevention responses to the wrong locations. As Bennett Moses and Chan (2016: 806) observe with regard to predictive policing: 'The algorithms used to gain predictive insights build on assumptions about accuracy, continuity, the irrelevance of omitted variables, and the primary importance of particular information (such as location) over others. In making decisions based on these algorithms, police are also directed towards particular kinds of decisions and responses to the exclusion of others.' In other words, it is possible that a lack of data collected in a given area means that a predictive analysis will fail to identify the increased likelihood of armed violence in this area. This would then also mean that the leadership of the peacekeeping missions would not divert adequate resources to manage the possible armed violence in this area. It should, however, be noted that this pitfall of data-driven peacekeeping is not very different from qualitative early warning assessments by information analysts, which can also be biased. In fact, the systematic analysis of peacekeeping data allows for an evaluation of predictions, which, in theory, can help information analysts to uncover any biases in the data collection.
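One way such an evaluation of predictions could look in practice is sketched below in Python. The district names, the records, the thresholds and the function names are all hypothetical; the sketch merely compares predicted probabilities with observed outcomes per district and flags districts where violence is under-predicted while few reports are coming in, which is the pattern one would expect if sparse data collection is hiding risk.

```python
def calibration_by_district(records):
    """Summarize records of (district, predicted probability, observed 0/1, reports)."""
    totals = {}
    for district, predicted, observed, n_reports in records:
        t = totals.setdefault(district, {"n": 0, "pred": 0.0, "obs": 0, "reports": 0})
        t["n"] += 1
        t["pred"] += predicted
        t["obs"] += observed
        t["reports"] += n_reports
    return {
        district: {
            "mean_predicted": t["pred"] / t["n"],
            "observed_rate": t["obs"] / t["n"],
            "reports_per_period": t["reports"] / t["n"],
        }
        for district, t in totals.items()
    }


def flag_possible_blind_spots(stats, gap=0.2, min_reports=5.0):
    """Districts where violence is under-predicted and reporting is sparse."""
    return sorted(
        district for district, s in stats.items()
        if s["observed_rate"] - s["mean_predicted"] > gap
        and s["reports_per_period"] < min_reports
    )


# Invented records: one row per district and period
RECORDS = [
    ("North", 0.6, 1, 20), ("North", 0.5, 0, 18),
    ("Remote", 0.1, 1, 2), ("Remote", 0.1, 0, 1), ("Remote", 0.2, 1, 3),
]

stats = calibration_by_district(RECORDS)
print(flag_possible_blind_spots(stats))
```

A flag of this kind does not prove bias; it only points analysts to areas where the gap between forecasts and outcomes coincides with thin reporting and therefore merits a closer look.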
Expectations also need to be managed with regard to the potential of predictive peacekeeping. When the police in some cities around the world started to engage in predictive policing, the media was quick to conclude that predictive algorithms 'can in theory predict, with pinpoint accuracy, where criminal offences are most likely to happen on any given day' (Bennett Moses and Chan 2016: 807). Yet, this is a highly unrealistic view of the actual capacity of predictive policing. No predictive policing tool currently available offers 'pinpoint accuracy' in terms of predicting events in time and space; instead, predictions cover larger blocks (such as street sections) and periods of weeks or months. In addition, not all criminal offenses are equally suitable for forecasting (Hart and Zandbergen 2012; Bennett Moses and Chan 2016). The same will be true for predictive peacekeeping. Rather than pinpointing exact locations of future armed violence, predictive peacekeeping will probably use grid cells of around 40 by 40 kilometers, localities or settlements as units of analysis (Duursma 2017; Cederman and Weidmann 2017).
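To make the notion of a grid cell as a unit of analysis concrete, the following Python sketch assigns event coordinates to cells of roughly 40 by 40 kilometers. It rests on simplifying assumptions: an equirectangular approximation with about 111.32 km per degree of latitude, a fixed reference latitude of 13 degrees for converting longitude to kilometers, and invented event coordinates. Established gridded datasets define their cells differently, so this is an illustration only.

```python
from math import cos, floor, radians

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


def grid_cell(lat, lon, cell_km=40.0, ref_lat=13.0):
    """Assign a coordinate to a (row, column) cell of about cell_km by cell_km.

    Uses an equirectangular approximation around ref_lat, which is adequate
    for a region of limited north-south extent but not for global grids.
    """
    north_km = lat * KM_PER_DEGREE
    east_km = lon * KM_PER_DEGREE * cos(radians(ref_lat))
    return floor(north_km / cell_km), floor(east_km / cell_km)


def count_events_per_cell(events, cell_km=40.0, ref_lat=13.0):
    """Count events per grid cell; events are (latitude, longitude) pairs."""
    counts = {}
    for lat, lon in events:
        cell = grid_cell(lat, lon, cell_km, ref_lat)
        counts[cell] = counts.get(cell, 0) + 1
    return counts


# Invented event coordinates: two close together, one about 110 km to the north
EVENTS = [(13.00, 25.00), (13.05, 25.05), (14.00, 25.00)]
cell_counts = count_events_per_cell(EVENTS)
print(cell_counts)
```

Forecasts would then be produced per cell and period rather than per exact location, which matches the level of precision the text describes as realistic.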
Finally, technological advances improve the ability to understand the operational situation from afar, and for affected populations to relay their needs (Meier 2015). However, this increases the risk of remote management accordingly, and lessens the ability to interact, understand and empathize with local populations, which, after all, are those UN peace operations should be most accountable to. The UN has been criticized for an increasing tendency of 'bunkerization': retreating behind the safe confines of high walls and Hesco barriers (Duffield 2013). Through innovation and simulation, technology can replace ground truth, premised on an 'uncritical technological-determinist vision of modulating the moods, expectations and actions of remote disaster-affected populations' (ibid.: 4; see also Sandvik et al. 2014).

Conclusion
Machine learning, a ubiquitous feature of our societies, is about to reach UN peace operations. The establishment of the SAGE system has enabled the use of predictive learning in UN peace operations. Data is being gathered, categorized and stored, and can be analyzed using machine learning techniques. This is a positive development, enabling preventive deployment to protect civilians and staff alike. However, it will require resources as well as careful thinking about the potential pitfalls that practitioners and policy makers will be confronted with.
Technological progress will nevertheless continue with unabated strength. By applying data analysis tools to SAGE data, it will also be possible to identify patterns that are not currently apparent. This could lead to new and innovative methods and tools for protection. Different types of interventions could be tested: military, police and civilian patrols, civil affairs mediation, use of surveillance UAVs, etc.
A future research agenda thus needs to consider the implications of current and future developments in this area. What ethical challenges will be faced? How do these technological advances change the way peace operations are managed and held accountable, to member states and affected populations on the ground? How can it be ensured that improved information indeed leads to improved action? Where and in what tools should member states invest to support these technological developments in the best possible manner? This article has outlined answers to some of these questions, but much work remains.
The SAGE tool collects a range of data, but is only one of many tools that the UN is implementing to improve data collection, analysis and, ultimately, its performance. In 2018, the UN started employing the comprehensive performance assessment system (CPAS) 'to assess whole-of-mission performance – civilian and uniformed components, staff and leadership – through data collection and analysis' (Lacroix 2018). The CPAS was rolled out in MINUSCA, UNMISS and UNIFIL in 2018, and will be rolled out in all missions by 2020 (ibid.). It is intended to improve the assessment of all dimensions of the work UN peacekeeping missions are doing, including in the political and substantive dimensions. More and better data can help improve performance in a range of areas, including the assessment of the intended and unintended economic, environmental and political impacts of missions on host societies (Ammitzbøll and Tychsen 2007; Ernst et al. 2014; Martin-Shields and Bodanac 2018; de Coning and Brusset 2018). Cederman and Weidmann (2017: 476) warn that researchers should not be overly optimistic about using machine learning to predict the onset of civil wars over a longer time frame, but note that 'forecasts with much more limited spatial and temporal scope – such as projected short-term trajectories of violence in a given city in an ongoing civil war' are perfectly within the realm of possibilities. The high-quality data collected within the context of UN peacekeeping missions has the potential to significantly improve the type of conflict prediction efforts that are limited in space and time. UN peacekeeping data is comprehensive, increasingly precise and, crucially, includes observations that, from a theoretical point of view, should be strong predictors of armed violence on the subnational level. Indeed, the recent turn to collecting high-quality data in a systematic manner is likely to enable the UN to identify opportunities for preventive action, thus increasing the effectiveness of UN peacekeeping operations.

Notes
1 The JMAC data has been collected between 3 January 2008 and 31 August 2009, in real time, to support the day-to-day operations of UNAMID.
2 We will return to SAGE in greater detail in the next section.
3 It should be noted upfront that the adoption of technology does not automatically make the UN fit for purpose. The positive impact of technology hinges on whether it helps the UN to be people-centered (de Coning et al. 2015; Karlsrud 2018a). The final section of this article will therefore reflect on the conditions that determine whether predictive analysis can help the UN be more effective.
4 Ushahidi is a web-based reporting system that utilizes crowdsourced data to formulate visual map information of a crisis on a real-time basis. The data can be provided via text messages, email, Twitter and web forms. 'Ushahidi which means "testimony" in Swahili, was a website that was initially developed to map reports of violence in Kenya after the post-election fallout at the beginning of 2008' (Ushahidi 2018).
5 SAGE is also based on the Ushahidi platform.
6 Of particular relevance, the UN Operations and Crisis Centre (UNOCC) has now been fully integrated into this section to 'provide United Nations senior managers with a global operational picture' (UN 2017c). Other key units are the senior advisor on policy who should '[e]nsure that fresh thinking and outside perspectives are introduced into the policymaking process,' the Strategic Planning and Monitoring Unit and the Political, Peacekeeping, Humanitarian and Human Rights Unit (UN 2017d).
7 Digitization is the process of making data digital, while digitalization is the process