Not all evidence is created equally

In the policy world, not all evidence is created equally.

I’m not talking about forensic or criminal evidence (though those areas have hierarchies too). I’m referring to the evidence we need to make a good policy choice. Policy decisions in policing include whether my police department should support a second responder program to prevent domestic abuse, or dedicate officers to teach D.A.R.E. in our local school. (The answer to both questions is no.) The evidence I’m talking about here is not just ‘what works’ to reduce crime significantly, but also how it works (what mechanism is taking place), and when it works (in what contexts the tactic may be effective).

We can harvest ideas about what might work from a variety of sources. Telep and Lum found that a library database was the least accessed source for officers when learning about what tactics might work[1]. That might be because for most police officers, a deep dive into the academic literature is like being asked to do foot patrol in Hades while wearing a nylon ballistic vest. But it also means that the ‘what works’, ‘how works’, and ‘when works’ of crime reduction are never fully understood. Too many cops unfortunately rely on their intuition, opinion or unreliable sources, as noted by Ken Pease and Jason Roach:

Police officers may be persuaded by the experience of other officers, but seldom by academic research, however extensive and sophisticated. Collegiality among police officers is an enduring feature of police culture. Most officers are not aware of, are not taught about and choose not to seek out relevant academic research. When launching local initiatives, their first action tends to be the arrangement of visits to similar initiatives in other forces, rather than taking to the journals.[2]

In fact, not only do officers favor information from other cops, but it also has to be from the right police officers. Just about everyone in policing at some point has been told to forget what they learned at the police academy. That was certainly my experience. And those courses were usually taught by experienced police officers! It’s too easy to end up with a very narrow and restricted field of experience on which to draw.

Fortunately, while a basic understanding of different research qualities is helpful, you do not need to have an advanced degree in research methodology to be able to implement evidence-based policing from a range of evidence sources. It’s sufficient to appreciate that there is a hierarchy of evidence in which to place your trust, and to have a rudimentary understanding of the differences between its levels. I’m not dismissing any forms of research, but I am saying that some forms of research are more reliable than others and more useful for operational decision-making. For example, internal department research may not be great for choosing between prevention strategy options, but it is hugely useful for identifying the range of problems.

There are a number of hierarchies of evidence available. Criminologists are familiar with the Maryland Scientific Methods Scale devised by Larry Sherman and colleagues[3]. In this scale, studies are ranked from 1 (weak) to 5 (strong) based on the study’s capacity to show causality and limit the effects of selection bias. Selection bias occurs when researchers choose participants in a study rather than allow random selection. It’s not necessarily an intent to harm the study, but it is a common problem. When we select participants or places to be part of a study, we can unconsciously choose places that will perform better than randomly selected sites. We cherry-pick the subjects, and so bias the study and get the result we want. It’s why so many supposedly great projects have been difficult to replicate.
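To see how cherry-picking alone can manufacture a result, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the 200 sites, the size of the hidden crime trends, the `hunches` that stand in for an experienced officer's unconscious feel for which places are about to improve, and the function name `compare_site_selection`. The program being "tested" does nothing at all, so any apparent effect comes purely from how the sites were chosen.

```python
import random
import statistics


def compare_site_selection(n_sites=200, n_treated=50, seed=1):
    """Apparent effect of a do-nothing program under two selection methods.

    Returns (hand_picked_effect, randomized_effect), each the difference in
    mean percent change in crime between treated and untreated sites.
    """
    rng = random.Random(seed)
    # Hidden trend at each site: the percent change in crime that will happen
    # next year regardless of any program (negative = crime falls anyway).
    trends = [rng.gauss(0, 10) for _ in range(n_sites)]
    # Assumption: the selector has a noisy, unconscious sense of each trend.
    hunches = [t + rng.gauss(0, 10) for t in trends]

    def apparent_effect(treated_sites):
        treated = set(treated_sites)
        treated_mean = statistics.mean(trends[i] for i in treated)
        control_mean = statistics.mean(
            trends[i] for i in range(n_sites) if i not in treated
        )
        return treated_mean - control_mean

    # Hand-picking: choose the sites that feel most likely to improve.
    hand_picked = sorted(range(n_sites), key=lambda i: hunches[i])[:n_treated]
    # Random assignment: every site has the same chance of being chosen.
    randomized = rng.sample(range(n_sites), n_treated)
    return apparent_effect(hand_picked), apparent_effect(randomized)


picked_effect, random_effect = compare_site_selection()
print(f"Hand-picked sites: {picked_effect:+.1f} percentage points")
print(f"Random assignment: {random_effect:+.1f} percentage points")
```

Under these assumptions the hand-picked sites appear to show a sizeable crime drop from a program that has no effect, while random assignment gives an estimate close to zero. Real selection bias is subtler than this toy, but the mechanism is the same.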

But the Maryland scale only addresses academic research, and police officers get knowledge and information from a range of sources. Rob Briner’s work in evidence-based management also stresses this. Go into any police canteen or break room and you hear anecdote after anecdote (I wrote a few weeks ago about the challenges of ‘experience’). These examples of homespun wisdom—also known as carefully selected case studies—are often illustrative of an unusual case and not a story of the mundane and ordinary that we deal with every day.

This professional expertise can also be supplemented by the stakeholder concerns of the local community (including the public or the local enforcement community such as other police departments or prosecutors). Knowing what is important to stakeholders and innovative approaches adopted by colleagues is useful in the hunt for a solution to a crime or disorder problem.

Police also get information from experts and internal reports. In larger forces, reports from statistics or crime analysis units can be important sources of information. Organizational data are therefore useful to officers who try and replicate crime reduction that a colleague in a different district appears to have achieved. All of these sources are important to some police officers, and they deserve a place on the hierarchy. But because they are hand selected, can be abnormal, might be influenced internally, or have not been subjected to a lot of scrutiny, they get a lower place on the chart.

In the figure on this page, I’ve pulled together a hierarchy of evidence from a variety of sources and tried to combine them in such a way that you can appreciate different sources and their benefits and concerns. And hopefully this might help you appreciate a little how to interpret the evidence from sources such as The National Institute of Justice’s CrimeSolutions.gov website and The UK College of Policing’s Crime Reduction Toolkit. In a later post I’ll try and expand on each of the levels (from 0 to 5*).

 

  1. Telep, C.W. and C. Lum, The receptivity of officers to empirical research and evidence-based policing: An examination of survey data from three agencies. Police Quarterly, 2014. 17(4): p. 359-385.
  2. Pease, K. and J. Roach, How to morph experience into evidence, in Advances in Evidence-Based Policing, J. Knutsson and L. Tompson, Editors. 2017, Routledge. p. 84-97.
  3. Sherman, L.W., et al., Preventing Crime: What works, what doesn’t, what’s promising. 1998, National Institute of Justice: Washington DC.

It’s time for Compstat to change

If we are to promote more thoughtful and evidence-based policing, then Compstat has to change. The Compstat-type crime management meeting has its origins in Bill Bratton’s need to extract greater accountability from NYPD precinct commanders in mid-1990s New York. It was definitely innovative for policing at the time, and instigated many initiatives that are hugely beneficial to modern policing (such as the growth of crime mapping). And arguably it has been successful in promoting greater reflexivity from middle managers; however, these days the flaws are increasingly apparent.

Over my years of watching Compstat-type meetings in a number of departments, I’ve observed everyone settle into their Compstat role relatively comfortably. Well almost. The mid-level local area commander who has to field questions is often a little uneasy, but these days few careers are destroyed in Compstat. A little preparation, some confidence, and a handful of quick statistics or case details to bullshit through the tough parts will all see a shrewd commander escape unscathed.

In turn, the executives know their role. They stare intently at the map, ask about a crime hot spot or two, perhaps interrogate a little on a case just to check the commander has some specifics on hand, and then volunteer thoughts on a strategy the commander should try—just to demonstrate their experience. It’s an easy role because it doesn’t require any preparation. In turn, the area commander pledges to increase patrols in the neighborhood and everyone commits to reviewing progress next month, safe in the knowledge that little review will actually take place because by then new dots will have appeared on the map to absorb everyone’s attention. It’s a one-trick pony and everyone is comfortable with the trick.

There are some glaring problems with Compstat. The first is that the analysis is weak and often just based on a map of dots or, if the department is adventurous, crime hot spots. Unfortunately, a map of crime hot spots should be the start of an analysis, not the conclusion. It’s great for telling us what is going on, but this sort of map can’t really tell us why. We need more information and intelligence to get to why. And why is vital if we are to implement a successful crime reduction strategy.

We never get beyond this basic map because of the second problem: the frequent push to make an operational decision immediately. When command staff have to magic up a response on the spot, the result is often a superficial operational choice. Nobody wants to appear indecisive, but with crime control it can be disastrous. Too few commanders ever request more time to do more analysis, or time to consider the evidence base for their operational strategies. It’s as if asking to think more about a complex problem would be seen as weak or too ‘clever’. I concede that tackling an emerging crime spike quickly might be valuable (though spikes often regress to the mean, or as Sir Francis Galton called it in 1886, regression towards mediocrity). Many Compstat issues, however, revolve around chronic, long-term problems where a few days isn’t going to make much difference. We should adopt the attitude that it’s better to have a thoughtfully considered successful strategy next week than a failing one this week.
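Regression to the mean is easy to underestimate, so here is a small Python sketch of it. The setup is entirely hypothetical: 200 areas that all share the same stable underlying crime level (about 30 incidents a month), with month-to-month variation that is pure chance. The function names `monthly_count` and `spike_then_next_month` are invented for this illustration. Nobody does anything to the 'spike' areas between the two months.

```python
import random
import statistics


def monthly_count(rng, trials=500, p=0.06):
    """One month's incident count for an area averaging trials * p incidents."""
    return sum(1 for _ in range(trials) if rng.random() < p)


def spike_then_next_month(n_areas=200, top_n=20, seed=7):
    """Average count in the worst areas this month, and in those same areas next month."""
    rng = random.Random(seed)
    month1 = [monthly_count(rng) for _ in range(n_areas)]
    month2 = [monthly_count(rng) for _ in range(n_areas)]
    # The 'spike' areas are the ones that would light up the map this month.
    spikes = sorted(range(n_areas), key=lambda i: month1[i], reverse=True)[:top_n]
    before = statistics.mean(month1[i] for i in spikes)
    after = statistics.mean(month2[i] for i in spikes)
    return before, after


before, after = spike_then_next_month()
print(f"Spike areas, month 1: {before:.1f} incidents on average")
print(f"Same areas, month 2:  {after:.1f} incidents on average")
```

In this toy world the spike areas drift back toward the long-run average the following month with no intervention at all. Had saturation patrols been sent to those areas, the drop would have looked like a success. Real areas do differ in their underlying crime levels, so this sketch only shows the chance component, not the whole picture.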

Because of the pressure to miracle a working strategy out of thin air, area commanders usually default to a limited set of standard approaches, saturation patrol with uniform resources being the one that I see at least 90 percent of the time. And it’s applied to everything, regardless of whether there is any likelihood that it will impact the problem. It is suggested by executives and embraced by local area commanders because it is how we’ve always escaped from Compstat. Few question saturation patrols, there is some evidence it works in the short term, and it’s a non-threatening traditional policing approach that everyone understands. Saturation patrol is like a favorite winter coat, except that we like to wear it all year round.

Third, in the absence of a more thoughtful and evidence-based process, too many decisions and views lack any evidential support and instead are driven by personal views. There is a scene in the movie Moneyball where all the old baseball scouts are giving their thoughts on which players the team should buy, based only on the scouts’ experience, opinion and personal judgment. They ignore the nerd in the corner who has real data and figures … and some insight. They even question if he has to be in the room. In the movie, the data analyst is disparaged, even though he doesn’t bring an opinion or intuition to the table. He brings data analysis, and the data don’t care how long you have been in the business.

Too many Compstat meetings are reminiscent of this scene. The centerpiece of many Compstat meetings is a map of crime that many are viewing for the first time. A room full of people wax lyrical on the crime problem based on their intuitive interpretation of a map of crime on the wall, and then they promote solutions for our beleaguered commander, based too often on opinion and personal judgement and too little on knowledge of the supporting evidence of the tactic’s effectiveness. Because everyone knows they have to come back in a month, the strategies are inevitably short-term in nature and never evaluated. And without being evaluated, they are never discredited, so they become the go-to tactical choice ad infinitum.

So the problems with Compstat are weak analysis, rushed decision-making, and opinion-driven strategies. What might the solutions be?

The U.K.’s National Intelligence Model is a good starting point for consideration. It has a strategic and a tactical cycle. The strategic meeting attendees determine the main strategic aims and goals for the district. At a recent meeting a senior commander told me “We are usually too busy putting out fires to care about who is throwing matches around.” Any process that has some strategic direction to focus the tactical day-to-day management of a district has the capacity to keep at least one eye on the match thrower. A monthly meeting, focused on chronic district problems, can generate two or three strategic priorities.

A more regular tactical meeting is then tasked with implementing these strategic priorities. This might be a weekly meeting that can both deal with the dramas of the day as well as supervise implementation of the goals set at the strategic meeting. It is important that the tactical meeting should spend some time on the implementation of the larger strategic goals. In this way, the strategic goals are not subsumed by day-to-day dramas that often comprise the tyranny of the moment. And the tactical meeting shouldn’t set strategic goals—that is the role of the strategic working group.

I’ve previously written that Compstat has become a game of “whack-a-mole” policing with no long-term value. Dots appear, and we move the troops to the dots to try and quell the problem. Next month new dots appear somewhere else, and we do the whole thing all over again. If we don’t retain a strategic eye on long-term goals, it’s not effective policing. It’s Groundhog Day policing.

What role for experience in evidence-based policing?

I recently received an illustrative lesson in the challenges of evidence-based policing. I was asked to sit in on a meeting where a number of senior managers were pitching an idea to their commander. It required the redistribution of patrols, and they were armed with evidence that the existing beats were not in the best locations and so were not as effective as they could be. The commander sat back in his chair and said “so I have to move some of those patrols?” Yes, the area managers responded, presenting a cogent yet measured response based on a thorough data analysis supported with current academic research. The commander replied “well in my experience, they are being effective so I am not going to move them”. And at that point the meeting ended. Experience trumped data and evidence, as it often does.

All the evidence available suggested that the commander made a poor decision. When I was learning to be a pilot I heard an old flying aphorism about decisions. Good decisions come from experience, and experience comes from bad decisions[1]. Unfortunately, the profession of policing cannot lurch through an endless cycle of bad decisions as new recruits enter policing and learn the business. The modern intolerance for honest mistakes and anything-but-perfection precludes this. It’s also expensive and dangerous. So how do we develop and grow a culture of good decisions? And what is the role of experience?

In praise of experience

Policing is unique in the liberal level of discretion and absence of supervision given to the least experienced officers. Having started patrol as a teenager, I can testify to the steep learning curve on entering the job. Experiences come at you thick and fast. In some ways, we learn from these experiences. Most cops absorb pretty quickly how to speak with people who are drunk or having a behavioral health crisis in a way that doesn’t end up in them rolling around in the street with fists flying. My colleagues demonstrated a style and tone and I learned from the experience they had gained over time.

We should also recognize that the evidentiary foundation for much of policing is pretty thin. We simply do not yet know much about what works and what good practice looks like. It’s not as if we have an extensive knowledge bank with which to replace experience. In recognition of this, the UK College of Policing note that “Where there is little or no formal research, other evidence such as professional consensus and peer review, may be regarded as the ‘best available’”. So practitioner judgement may help fill a void until that time when we have more research across a wider variety of policing topics. In time, this research will help officers achieve better practice. In the meantime, shared experience may be of value, if (in the words of the UK College of Policing) “gathered and documented in a careful and transparent way”.

Finally, personal intuition and opinion may not be a sound basis on which to make policy, but sometimes it can offer insights in rarely studied areas. This can prompt new ways of looking at problems. By varying experience, we can learn new ways to deal with issues. These new ways could then be tested more formally. There is definitely a place for personal judgement in the craft of policing. But the current reliance on it prevents us embracing a culture of curiosity and developing that evidence base[2]. And personal experience has other limitations.

A critique of experience

Like the commander at the start of this section, unfortunately, most police leaders don’t make decisions using the best evidence available. They overwhelmingly prefer decisions that are entrenched in their personal experience. The problem is that everyone’s experience is limited (we can’t have been everywhere and dealt with every type of incident), and in policing we receive too little feedback to actually learn many lessons.

What do I mean? As a young cop, I attended countless domestic disturbance calls armed with so little personal experience in long-term relationships it was laughable. It soon became clear that the measure of ‘success’ (against which ‘experience’ was judged) was whether we got a call back to that address during that shift. If we did, I had failed. If we didn’t, I had succeeded and was on my way to gaining the moniker of ‘experienced’.

But what if the husband beat his partner to within an inch of her life within hours of my going home? Or the next week? If our shift wasn’t on duty I would never learn that my mediation and resolution attempts had been unsuccessful or worse, harmful. I would never receive important feedback and would continue to deal with domestic disturbance calls in the same way. Absent supervision and feedback, not only would I continue to act in a harmful manner, worse, my colleagues and I might think I was now experienced. They might prioritize my attendance at these calls, and perhaps eventually give me a field training role. My bad practice would now become established ‘good’ practice.

As others have noted “personal judgment alone is not a very reliable source of evidence because it is highly susceptible to systematic errors – cognitive and information-processing limits make us prone to biases that have negative effects on the quality of the decisions we make”. Even experienced police officers are not great at identifying crime hot spots[3] and do not do as well as a computer algorithm[4]. This isn’t just an issue for policing. Experts, many with years of experience, are often poor at making forecasts across a range of businesses and professions. The doctors that continued to engage in blood-letting into the latter half of the 19th century weren’t being callous. They probably had good intentions. But their well-meaning embrace of personal judgement, tradition and supposed best practice (probably learned from a medical guru) killed people.

I frequently conduct training on evidence-based and intelligence-led policing. I often run a quick test. I show officers a range of crime prevention interventions and ask which are effective. It’s rare to find anyone who can get the correct answer, and most folk are wildly off target. It’s just for fun, but illustrates how training and education in policing still remain at odds with a core activity, the reduction of crime.

A role for professional experience?

As Barends and colleagues note[5], “Different from intuition, opinion or belief, professional experience is accumulated over time through reflection on the outcomes of similar actions taken in similar situations.” It differs from personal experience because professional experience aggregates the knowledge of a variety of practitioners. It also emerges from explicit reflection on the outcomes of actions.

This explicit reflection requires feedback. When I was learning to fly, a jarring sensation and the sound of the instructor wincing was the immediate feedback I needed to tell me I had not landed as smoothly as I had hoped. But flying around the traffic pattern, I immediately had another chance to prove improvement and a lesson learned. This type of immediate feedback and opportunity to improve is rare in policing. The radio has already dragged us to a different call.

For many enforcement applications, a research evaluation is essential to provide the kind of feedback that you can’t get from personal observation. The research on directed patrol for gun violence is a good example of how research evidence can improve strategy and increase public safety. Science and evaluation can replicate the experiences of hundreds of practitioners and pool that wisdom. While you can walk a single foot beat and think foot patrol is a waste of time, the aggregate experiences and data from 240 officers across 60 beats tell us differently.
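Why can one officer's view from one beat be so misleading when the pooled picture is clear? A rough Python sketch can illustrate the arithmetic. The numbers here are invented for illustration and are not findings from any study: the sketch assumes a true 8 percent reduction that applies on every beat, plus a large dose of local month-to-month noise. The names `beat_outcomes` and `summarize` are hypothetical.

```python
import random
import statistics


def beat_outcomes(n_beats=60, true_change=-8.0, noise_sd=15.0, seed=3):
    """Percent change in crime on each beat: a shared effect plus local noise."""
    rng = random.Random(seed)
    return [true_change + rng.gauss(0, noise_sd) for _ in range(n_beats)]


def summarize(changes):
    """Return (beats where crime rose, pooled mean change, standard error)."""
    worse = sum(1 for c in changes if c > 0)
    pooled = statistics.mean(changes)
    stderr = statistics.stdev(changes) / len(changes) ** 0.5
    return worse, pooled, stderr


changes = beat_outcomes()
worse, pooled, stderr = summarize(changes)
print(f"Beats where crime went up: {worse} of {len(changes)}")
print(f"Pooled change across all beats: {pooled:+.1f}% (standard error {stderr:.1f})")
```

Under these assumptions a fair number of individual beats see crime rise even though the tactic works, so the officers walking them could honestly conclude it was a waste of time. Pooling all 60 beats averages out the local noise, and the standard error of the pooled estimate is far smaller than the variation any one beat experiences.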

Tapping into scientific research findings and available organizational data (such as crime hot spot maps and temporal charts) will enhance our professional experience. Being open to the possibility that our intuition and personal opinion may be flawed is also important, though difficult to accept. And developing a culture of curiosity that embraces trying new ways of tackling crime and disorder problems might be the most important of all. The starting point is to recognize that if personal experience remains the default decision-making tool, then we inhibit the development of better evidence. And we should realize that approach is harmful to communities and colleagues alike.


[1] This quote (sometimes replacing decisions with judgement) is attributed to various sources, but the most common is Mark Twain. It should also be pointed out that my fondness for checklists stems from one of the aviation industry’s attempts to reduce the poor decision-making learning spiral.

[2] I’m grateful to a smarter friend for pointing this out to me.

[3] Ratcliffe, J.H. and M.J. McCullagh, Chasing ghosts? Police perception of high crime areas. British Journal of Criminology, 2001. 41(2): p. 330-341.

[4] Weinborn, C., et al., Hotspots vs. harmspots: Shifting the focus from counts to harm in the criminology of place. Applied Geography, 2017. Online first. 

[5] Barends, E., D.M. Rousseau, and R.B. Briner, Evidence-Based Management: The Basic Principles. 2014, Amsterdam: Center for Evidence-Based Management.