Data Analytics in Policing

Written by Dan Mount, Head of Policy and Public Affairs at Civic Agenda

Case study: the Odyssey Project

The Odyssey Project was an EU-funded initiative designed to facilitate the interrogation, manipulation and sharing of data about ballistic gun crime across different European police forces. Key project partners included Sheffield Hallam University (UK), Atos Origin (Spain), EUROPOL (Netherlands), SAS Software Ltd (UK), alongside police forces in the UK, Ireland and Italy.

The project’s objective was to develop a prototype platform enabling the automated combination of ballistics and firearms data from a broad range of sources across different national jurisdictions for cross-correlation and analysis. This approach was conceived to help circumvent the traditional obstacles affecting the investigation of cross-border gun crime, where discovering and accessing data held by different national agencies can be notoriously difficult. For example, without an integrated pan-European ballistics intelligence information system there is no way of electronically sharing key information (such as forensic laboratory data or photographs of cartridge casings). In this context, forensic comparison of a single bullet could only be achieved by physically transporting that bullet to different national agency locations, at an average cost of around €9,000 including travel, accommodation, equipment and logistics.
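
The kind of schema normalisation such a platform depends on can be illustrated with a short sketch. Everything below is an assumption made for illustration: the field names, unit conversions and the two national record formats are invented and bear no relation to Odyssey's actual data model.

```python
# Illustrative sketch: normalising ballistics records from two invented
# national schemas into a common format so they can be cross-correlated.
# All field names and conversions are assumptions, not Odyssey's real model.

def from_uk_record(rec: dict) -> dict:
    """Map a hypothetical UK-style record onto the common schema."""
    return {
        "case_id": rec["crime_ref"],
        "country": "UK",
        "calibre_mm": rec["calibre"],            # assumed already metric
        "weapon_type": rec["weapon"].lower(),
        "recovered_date": rec["date_recovered"],
    }

def from_es_record(rec: dict) -> dict:
    """Map a hypothetical Spanish-style record, converting units and labels."""
    return {
        "case_id": rec["referencia"],
        "country": "ES",
        "calibre_mm": rec["calibre_pulgadas"] * 25.4,   # inches to millimetres
        "weapon_type": {"pistola": "pistol", "fusil": "rifle"}.get(rec["arma"], "unknown"),
        "recovered_date": rec["fecha"],
    }
```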

During the initial stages of the Odyssey prototype it was expected that the greatest challenges would lie in developing the technical architecture and solutions needed to aggregate and share different data sources. In reality, these technical challenges proved relatively straightforward. The primary hurdle facing the project was how to persuade different agencies to work together in a context where there were no common standards for the collection of ballistic crime data, and where different jurisdictions had divergent legal frameworks limiting the amount and types of data that could be shared across borders.

Alongside the difficulties of designing an IT system which satisfied the expectations of such a broad range of international users, further issues were encountered in persuading national police forces to allocate proportionate staff time to entering relevant data onto the system. In addition, it was essential to highlight the distinction between intelligence and evidence. While the system could suggest ballistic matches based on probability analysis, a forensic examiner still needed to study the potential matches under a microscope to confirm which of them were correct.
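
That division of labour, with machine-generated suggestions treated strictly as intelligence until an examiner confirms them, can be sketched in a few lines. The scoring logic and threshold below are invented for illustration and are far simpler than any real ballistics comparison.

```python
# Illustrative sketch: rank candidate ballistic matches by a similarity score,
# but label every suggestion as unconfirmed intelligence pending expert review.
# The scoring function and threshold are invented assumptions.

def similarity(a: dict, b: dict) -> float:
    """Toy similarity score between two normalised ballistics records."""
    score = 0.0
    if abs(a["calibre_mm"] - b["calibre_mm"]) < 0.1:
        score += 0.6
    if a["weapon_type"] == b["weapon_type"]:
        score += 0.4
    return score

def suggest_matches(record: dict, database: list, threshold: float = 0.8) -> list:
    """Return candidates as intelligence only; confirmation stays with the examiner."""
    scored = [(similarity(record, other), other) for other in database]
    return [
        {"candidate": other, "score": s, "status": "UNCONFIRMED_INTELLIGENCE"}
        for s, other in sorted(scored, key=lambda pair: pair[0], reverse=True)
        if s >= threshold
    ]
```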

Building the case for cross-border data sharing and analytics

There is a strong business case for expanding and promoting the number of success stories associated with cross-border data sharing, analytics and policing. One example is a recent case in which a number of NATO member states in Eastern Europe were found to be dumping military shell casings on the copper market. This additional copper supply was being used by criminals to manufacture illegal ammunition casings, which were then used to commit gun crimes in other parts of Europe and the UK. As a result, police investigators were able to engage with those NATO countries to request that they cut off the supply of illegal ammunition materials. A further example is the case of a motorway marksman who fired bullets into luxury cars as they were transported by lorry along the German autobahn. These bullets remained embedded in the vehicles and were only discovered by police when they reached their final destination in Spain. Collaboration and data sharing between German and Spanish police forces was therefore essential to identifying the perpetrator. Both these case studies were used to firm up the initial business case behind the EU-funded Odyssey Project.

The evolution of predictive analytics

Twenty years ago predictive analytics was a resource-intensive enterprise, making it the exclusive preserve of governments and large corporations. Now organisations of all sizes are collecting and analysing substantial quantities of data. The professional and organisational challenges associated with predictive analytics have been progressively reduced, both through technological advances and through the profit motive, which increasingly drives organisations to unify resources behind key business objectives. The profit motive is also a powerful incentive for private sector firms to maintain the quality of their data: Experian, for example, imposes a number of information reporting duties on the lenders which subscribe to its services.

One of the main developments in predictive analytics over the last two decades is that analysis now often takes place in real time. For example, banks can revise and update their lending models based on real-time market information and customer profiling. According to Gartner, big data can be defined by three “V’s”: volume, velocity and variety.[1] Others have since suggested that a further three “V’s” be added to this equation: veracity, value and vulnerability.[2] Other commentators have argued that the overarching and defining feature of big data is its ubiquity.
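
On the real-time point above, what this means in practice is that each new observation updates the model immediately rather than via a periodic batch retrain. A minimal sketch of the idea follows, using plain stochastic gradient descent on a logistic model; the features, weights and learning rate are invented for illustration and do not represent any actual bank's lending model.

```python
import math

def online_update(weights: list, features: list, outcome: int, lr: float = 0.01) -> list:
    """One real-time step: nudge a logistic lending model toward the observed outcome."""
    z = sum(w * x for w, x in zip(weights, features))
    predicted = 1.0 / (1.0 + math.exp(-z))   # predicted probability of default
    error = outcome - predicted              # observed minus predicted
    return [w + lr * error * x for w, x in zip(weights, features)]

# Each incoming repayment record updates the model the moment it arrives.
weights = [0.0, 0.0, 0.0]
weights = online_update(weights, [1.0, 0.4, 0.7], outcome=1)
```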

Key challenges for government

When we look at the potential use of big data and predictive analytics across government, we encounter significant restrictions on the use and sharing of data, given that relevant information tends to be fragmented across departmental and agency silos. In 2008 the Review of Criminality Information led by Sir Ian Magee featured a map of the disparate institutional geography and relationships which form the essence of this challenge in the realm of public protection.[3] The diagram aptly demonstrates that accountability and responsibility are split across five different government departments, alongside an alphabet soup of agencies, committees and inspectorates.

Another key challenge for government is that a great deal of information is stored in physical data centres which cannot currently be joined up. It should also be noted that the Cabinet Office published a new Protective Marking/Security Classifications Scheme[4] in 2013 which lacks any deadlines for compliance or guidelines for agency implementation.

Focusing on local partnerships

There are many notable examples of effective data sharing initiatives between a range of actors involved in policing, education and health at local authority level. For example, IBM has worked with Camden Council to join up 16 previously separate information systems (including youth justice) within a three-month timeframe. The project[5] developed a “Residents’ Index” uniting information from multiple services to create a single, consistent view of residents across the borough and the council services they use. However, once such initiatives seek to replicate their success at regional or national level the process becomes significantly more difficult. Then again, given that 80% of services are delivered at local government level, perhaps the key question is how to empower local champions to forge effective data sharing partnerships.
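
The mechanics behind a single view of this kind are essentially record linkage: matching records from separate systems on normalised identifying fields. The sketch below is an invented illustration of that principle, not Camden's or IBM's actual implementation; all field names and source systems are assumptions.

```python
# Illustrative sketch: link records from separate systems into one view per
# resident by normalising a simple matching key. Field names are assumptions.

def match_key(record: dict) -> tuple:
    """Normalise identifying fields into a simple matching key."""
    return (
        record["surname"].strip().lower(),
        record["date_of_birth"],
        record["postcode"].replace(" ", "").upper(),
    )

def build_index(*source_systems) -> dict:
    """Group records from many systems into a single view per resident."""
    index = {}
    for system_name, records in source_systems:
        for rec in records:
            index.setdefault(match_key(rec), {})[system_name] = rec
    return index

residents = build_index(
    ("housing", [{"surname": "Smith", "date_of_birth": "1980-01-02", "postcode": "NW1 1AA"}]),
    ("youth_justice", [{"surname": "smith ", "date_of_birth": "1980-01-02", "postcode": "nw1 1aa"}]),
)
# Both records normalise to the same key and so appear under one resident.
```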

A choice of approaches – overt or covert analytics?

A central question for any new predictive analytics initiative is whether the results should be used internally or shared with the public. For example, if big data analytics of NHS records shows a 15-year difference in life expectancy across adjacent postcodes, there could be negative consequences to making that information public. On the other hand, in the case of tracking the number plates of stolen cars on the motorway, a key element of the approach is that criminals need to be aware that law enforcement has this capability, so that it serves as a deterrent to vehicle theft. Whether analytics takes place in a public or an internal organisational context needs to be decided at the very start of any project, as this decision has significant implications for how such initiatives are positioned in messaging and PR terms.

Common standards for data collection and maintenance

The Government Digital Service (GDS) is currently working on the standardisation of data collection and maintenance across multiple departmental systems and services which currently have no common standards (see the GDS Government Service Design Manual[6]). As government moves towards a platform approach to delivering its services, it is likely that common standards will progressively emerge. However, in most instances trans-national standards and solutions are probably many years in the future. The Icelandic government has adopted a policy whereby, once a government department or agency has requested a piece of personal information from a citizen, no other part of government may issue a repeat request for the same information. This kind of approach would prevent the familiar syndrome of government departments issuing automated letters to deceased individuals despite that information being available elsewhere within government (or potentially elsewhere within the same department).
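
The Icelandic rule is straightforward to express in code: before asking a citizen for a piece of information, check a shared registry of what government already holds. The sketch below is entirely illustrative and is not based on the actual Icelandic implementation; the class, identifiers and department names are invented.

```python
# Illustrative "ask once" registry: repeat requests for the same attribute
# are refused and routed to the department that already holds the data.

class AskOnceRegistry:
    def __init__(self):
        self._held = {}   # (citizen_id, attribute) -> holding department

    def request(self, citizen_id: str, attribute: str, department: str) -> dict:
        key = (citizen_id, attribute)
        if key in self._held:
            # Refuse the repeat request; point at the data already held.
            return {"ask_citizen": False, "source": self._held[key]}
        self._held[key] = department
        return {"ask_citizen": True, "source": None}

registry = AskOnceRegistry()
registry.request("1234", "date_of_birth", "tax_office")     # asks the citizen
registry.request("1234", "date_of_birth", "health_agency")  # reuses the tax office copy
```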

Predictive analytics and policing

The full potential of predictive analytics in policing lies not just in reacting better to existing problems or preventing past problems from recurring. Ideally it should use data to anticipate the problems of the future. This involves identifying trends which are not apparent to a human analyst but can be detected by a machine. At the same time, before the benefits of full-scale predictive analytics can be realised, there are significant early wins to be achieved by effectively joining up police data which already exists. For example, in the UK the National Ballistics Intelligence Service (NABIS) offers a national database of all recovered firearms and ballistic material (including ammunition rounds, shell casings and projectiles). Meanwhile the Police ICT Company Directorate administers the National Firearms Licensing Management System (NFLMS), a register of all individuals who have applied for or been granted a certificate for a firearm or shotgun. Given that a significant percentage of gun crime is committed using licensed firearms, there would be substantial value in combining and comparing the information on these two databases. There are also further quick wins to be achieved by allocating resources more efficiently against emerging and evolving patterns of criminality.
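
The value of combining the two databases comes down to a join: recovered items on one side, licence records on the other, linked on a shared identifier. The sketch below assumes serial numbers as the join key and invents all field names; it does not reflect the real NABIS or NFLMS schemas.

```python
# Illustrative sketch: flag recovered ballistic material that matches a
# licensed weapon, joining invented NABIS-style and NFLMS-style records.

recovered = [
    {"item_id": "B-001", "serial_number": "SN12345", "found_at": "Sheffield"},
]
licences = [
    {"serial_number": "SN12345", "holder": "J. Doe", "certificate": "C-9876"},
]

def link_recovered_to_licences(recovered: list, licences: list) -> list:
    """Return recovered items annotated with any matching licence record."""
    by_serial = {lic["serial_number"]: lic for lic in licences}
    return [
        {**item, "licence": by_serial[item["serial_number"]]}
        for item in recovered
        if item["serial_number"] in by_serial
    ]
```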

Police National Database

In June 2011 the National Policing Improvement Agency (NPIA) launched the Police National Database (PND), which contains details of an estimated 15 million UK citizens and is available to 53 police forces across England, Wales and Scotland.[7] The PND also includes over 2 billion POLE records[8] (information categorised by people, objects, locations and events) gathered from across the police sector. This is a far more comprehensive approach than the closest equivalent system in the United States, which is used by only approximately 5% of the 40,000 US police forces. In addition, in a context where resources can be polarised between the National Crime Agency at national level and Police and Crime Commissioners at local level, the PND is a helpful resource for supporting policing at regional level. To decrease the barriers to adoption and use, the PND user interface was designed to resemble Google search and Google Maps, capitalising on police staff’s familiarity with those commercial interfaces from home. Finally, it is important to remember that law enforcement data can differ from traditional data in that criminals have a vested interest in submitting inaccurate or false information.
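
The POLE structure mentioned above is easy to picture: every record is a person, object, location or event, and the links between records carry much of the intelligence value. The sketch below is an assumed illustration of that shape, not the PND's actual schema; all record types and field names are invented.

```python
# Illustrative POLE-style records: people, objects, locations and events,
# connected by links. Structure and field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class PoleRecord:
    kind: str                # "person" | "object" | "location" | "event"
    record_id: str
    attributes: dict
    links: list = field(default_factory=list)   # ids of related records

suspect = PoleRecord("person", "P1", {"name": "J. Doe"}, links=["E1"])
vehicle = PoleRecord("object", "O1", {"reg": "AB12 CDE"}, links=["E1"])
scene = PoleRecord("location", "L1", {"postcode": "S1 1AA"}, links=["E1"])
incident = PoleRecord("event", "E1", {"type": "burglary"}, links=["P1", "O1", "L1"])
```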

The automation of manual police processes versus predictive analytics

Technological adoption and adaptation across law enforcement has mostly been focused on automating previously manual processes. For example, before the Serious Organised Crime Agency was absorbed into the NCA, 80% of its analysts were carrying out relatively low-level analytical tasks which were ripe for automation. True predictive analytics involves substantially more than this. Yet to prevent a cultural backlash, it needs to be made clear that the primary objective of predictive analytics is not to replace police officers. Instead, its purpose is to supplement their years of experience and local knowledge with additional tools and information to help prevent crime and protect communities. At all times an effective balance will need to be struck between traditional policing methods and new technology-enabled approaches. We need to acknowledge that many police officers will still want offline confirmation of digital data (for example, speaking directly to someone in the community) before taking action based on a digital policing tool. A further challenge lies in persuading local police forces to sacrifice some operational independence and accountability by allowing joint ownership of investigations and the data relating to them.

The prioritisation of arrests over prevention

Across law enforcement there is often an institutional focus on arrest volumes at the expense of preventative outcomes. This is partly because arresting 100 criminals is always a more tangible achievement than claiming to have prevented 1,000 potential crimes. In other words, how can you prove that you have prevented a crime that never actually happened? At local level, Police and Crime Commissioners are usually only in post for between one and three years, so from a short-term political perspective longer-term prevention is bound to be seen as less important than immediate results which can be used to cement their case for re-election.

In terms of public perceptions, burglary is a source of far greater concern than firearms offences, which are far lower in volume but arguably still extremely important given their potential for loss of life. However, predictive analytics can play an equally valuable role in mitigating high-volume crimes like burglary: for example, data analysis may reveal that houses within a 3-mile radius of a previously burgled property are 300% more likely to be broken into during the first three months after the initial crime was committed.
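
Acting on such a near-repeat pattern is a simple calculation. The sketch below reads “300% more likely” as baseline risk plus 300% (a factor of four) and invents the baseline figure and distance function; it is an illustration of the idea, not an operational model.

```python
from datetime import date, timedelta

BASELINE_RISK = 0.01     # assumed baseline burglary risk for the window
RISK_MULTIPLIER = 4.0    # "300% more likely" read as baseline plus 300%
RADIUS_MILES = 3.0
WINDOW = timedelta(days=90)

def burglary_risk(property_loc, burglaries, today, distance_miles):
    """Return elevated risk if any recent burglary falls inside the radius."""
    for b in burglaries:
        close = distance_miles(property_loc, b["location"]) <= RADIUS_MILES
        recent = timedelta(0) <= today - b["date"] <= WINDOW
        if close and recent:
            return BASELINE_RISK * RISK_MULTIPLIER
    return BASELINE_RISK

risk = burglary_risk(
    (53.38, -1.47),
    [{"location": (53.40, -1.45), "date": date(2015, 1, 10)}],
    today=date(2015, 2, 1),
    distance_miles=lambda a, b: 1.5,   # stand-in distance function for the sketch
)
```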

Conclusions and closing comments

The role of federated approaches to identity

It was suggested that there is substantial scope for law enforcement to build upon the example set by GDS in implementing open data strategies and open Application Programming Interfaces (APIs), which are setting new standards of expectation across government departments and agencies, spearheaded by the Government Service Design Manual. In particular, a federated approach towards authenticating the identity of those accessing digital government services was seen as especially valuable. This is exemplified by the public beta launch of the GOV.UK Verify service, which allows certified companies to verify the identity of service users instead of relying on a single government database.

Statistical inferences versus evidence based facts

It is important to remember that data analytics produce equivocal results, and that the most effective kind of profiling depends upon deductive as opposed to inductive reasoning. There can be a risk of automation bias, whereby the assumption is made that whatever an algorithm says must be the truth. The reality is that care must always be taken in police work and elsewhere to distinguish between statistical inferences and evidence based facts.

People and businesses think this is already happening

It was remarked that there is a surprising level of similarity between the challenges faced by law enforcement in joining up different data sources and conducting effective analytics, and the challenges faced across the wider institutional geography of government departments and agencies. It was also commented that there appears to be a natural inclination among public audiences to see government as the primary tracker of people’s information journeys. In other words, many individuals and businesses assume that law enforcement and government agencies are already capable of conducting data sharing and analytics on a far greater scale than currently exists. In reality the public protection network map is extremely fragmented, which suggests steps should be taken to increase public awareness of these issues and of the business case for addressing them more successfully.

The overarching importance of cultural and institutional change

It is clear that technology enables desirable new outcomes which can only be achieved alongside significant cultural and institutional change. Indeed, the primary barriers to successful data sharing and effective analytics are unlikely to be the development of the necessary technical solutions; they will instead manifest themselves in resistant institutional cultures, incompatible legal and regulatory frameworks, and deficiencies in digital literacy and information management skills. We need to incentivise the adoption of a holistic and strategic mindset in responding to these challenges, as opposed to a series of granular tactical approaches. And we need to consider solutions which strike an efficient balance between transparency, openness, accountability and privacy whilst maximising the public benefits that these new technological approaches have to offer.


[1] Gartner’s IT Glossary identifies volume, velocity and variety as key aspects of big data

[2] NIST Big Data Definitions and Taxonomies Version 1.0, August 2013, page 4

[3] Review of Criminality Information, Sir Ian Magee, 2008, page 26

[4] Government Security Classifications April 2014, published by the Cabinet Office in 2013

[5] Driving cost-effective services by gaining a single view of how the council engages with residents, IBM 2014, accessed from the IBM website on 23rd February 2015

[6] Government Service Design Manual, Government Digital Service website accessed 23rd February 2015

[7] Police National Database launched containing details of 15 million UK citizens, Computer Weekly, 22nd June 2011

[8] POLE Data Warehouse, G-Cloud Service Definition, CGI, September 2013, page 6


Dan Mount is Head of Policy and Public Affairs at Civic Agenda.
