Not all evidence is created equally (an update)

The Australian and New Zealand Society of Evidence Based Policing (@ANZSEBP) recently retweeted a graphic from an old blog of mine, so this seems a good time to update and explain it a little.

The chart above is adapted from various sources and emphasizes quantitative studies and randomized trials. Some argue that randomized trials can be of limited value, or difficult to implement, and that observational studies and other sources of information can inform policing. This is all true. Moreover, qualitative research can be useful for interpreting evaluation results, gaining insight into why programs succeed or fail, and considering where they should go next. But if you have an opportunity to conduct an evaluation, try to design it to get the best possible assessment of the program.

Any research field has variable levels of what is called methodological quality. If you think all evaluations are useful for deciding how we spend our money, then boy, do I have a bridge to sell you!

Just look through Amazon. Reviewers rarely compare one product against another. You more frequently find five-star reviews alongside comments such as “Can’t wait to try this!” or “It arrived on time and works as advertised.” Your widget might work as advertised, but does it work better than other widgets?

One of the biggest challenges evaluators encounter is ruling out competing explanations for a crime drop. Here’s a recent example. San Francisco’s police department credited crisis intervention training with a reduction in police use of force incidents. Simply noting a change in the numbers doesn’t, however, rule out a range of other possible explanations, such as officers conducting fewer proactive field investigations or making fewer arrests (activities that can sometimes spark an incident).

Not to mention, it is not uncommon for two or three different programs to claim credit for crime drops in the same area.

The center column of my updated figure now shows examples of each level. If terms like cross-sectional or longitudinal feel unnecessarily technical—welcome to academic jargon—then the examples may help. You might make the connection between my example of the license plate readers and San Francisco’s crisis intervention training.

San Francisco’s CIT program scores at best a 2 (because of its simplistic pre-post claim), or, worse, one of the zeroes, because it is an internal assertion that has not been peer reviewed. The lowest zero level probably seems harsh on police chiefs, but many are unfamiliar with the research and do not review it when the media calls or when they write their magnum opus. They trade on their “expertise” and hope, or believe, that authority can substitute for knowledge (unfortunately, it frequently works).

Experience is valuable, but it is also vulnerable to many potential biases that make it less reliable.

And when academics are quoted in newspapers, their words pass through too many filters and are usually too brief to be a reliable source for decision-makers.

As for the other zero: while I recognize that some police departments do exemplary research and may be impervious to political and internal pressure, this is regretfully rare. Third-party evaluations often bring more rigor and impartiality.

Once we hit level 3 we cross an important threshold. Writing on evidence-based policing (EBP), Larry Sherman argued that “the bare minimum, rock-bottom standard for EBP is this: a comparison group is essential to every test to be included as ‘evidence’”. Above level 2 we clear this hurdle, hence the chart background turns from red (Danger, Will Robinson!) and yellow (bees!) to green.

What’s suspect or just interesting becomes what’s promising or what works.

Up at level 5 we have experiments that randomize treatment and control groups or areas, because (in principle) they can rule out most of the problems associated with less rigorous studies. For example, by limiting our capacity to influence where or to whom a program is applied, we remove (or at least reduce) the risk of selection bias. I have encountered police commanders who all but demanded their pet area receive a patrol intervention, only to be thwarted by randomization. Would the program have worked in those areas anyway, or would it only have appeared to work because commanders were already paying attention to those areas?

Good randomization studies can rule out a large swathe of competing explanations, and this approach remains the strongest research design for testing many ideas (I don’t recommend it for parachutes).

It is sometimes, incorrectly, argued that randomization is unethical because it withholds benefits from the control units or areas.

We have evaluations precisely because we have not proven certain programs work. Randomization is therefore a highly ethical approach to gauging the value of spending taxpayer dollars.

Finally, randomized experiments usually contribute to the 5* meta-analyses that examine the totality of evidence for a crime reduction program. The real world is messy, and systematic reviews conducted by trained analysts are vital tools to help us make sense of complicated areas. Within a systematic review, a single study finds its place in the wider body of research, making its contribution to policy knowledge.

There is of course much more to understand about this area, and there are numerous verbose books about research design and evaluation methodology. Until you are brave enough for that, I hope this short, non-technical overview helps you understand the graphic and appreciate that not all research is created equal.

Hey chief, stop looking for the silver bullet

I recently had the pleasure of watching a good friend talk to a room full of police leaders about evidence-based policing. Even though the presentation was thoughtful and succinct, listening to their comments and questions afterwards, it was clear that a few chiefs were still searching for the mythical silver bullet. I presented next, and got a similar response from a couple of attendees: “What can we do that just works?”

Ah, the elusive silver bullet, known to slay all werewolves.

The silver bullet is a police-led initiative that is guaranteed to cut crime in half, doesn’t require resources or incur overtime, won’t infuriate the union, improves police legitimacy, is media friendly, popular with politicians, doesn’t generate complaints or litigation, and can be implemented by Monday. It is also mythical.

The bullet doesn’t exist because no strategy—however well supported by research—is guaranteed to succeed. I know you might have read websites that list what works and what doesn’t, but they all need a Barry Bonds-type asterisk. The asterisk here means ‘it depends.’ There are two reasons for the asterisk: strategy validity and implementation effectiveness.

Strategy validity comprises the effectiveness of the strategy and its external validity. Effectiveness is important because some tactics generally don’t work. The classic examples are D.A.R.E. and Scared Straight. I managed to upset the commandant of a police academy by pointing this out after he had dedicated over a decade of his life to being a D.A.R.E. instructor. “If I can save just one child, it’s been worth it,” he said as he stormed off. And he might have been right. Perhaps he was that instructor who managed to succeed. Unfortunately, if we are to be evidence-based, given the preponderance of evidence, it is likely his efforts were in vain.

External validity is an indication of the extent to which a project that worked in one place will work in another. For example, the Philadelphia Foot Patrol Experiment found that foot patrols in high crime hot spots reduced violence by 23 percent compared to equivalent comparison foot beats. As a result, in Philadelphia, new recruits out of the academy automatically go onto foot beats for the first part of their service.

But what if you are in a rural area with low population density? As Evan Sorg and I write (borrowing from Pete Moskos), if the mail carrier isn’t walking, it probably doesn’t make sense for the cops to be either. Some strategies have greater applicability to a wider range of police departments. Most places have experimented with variations of Compstat at some point, whereas fewer departments have explored the value of fixed wing aircraft. The external validity of a tactic is therefore a factor to consider in estimating likely success.

Strategy validity, then, is a combination of effectiveness and external validity, but it is of little relevance if your tactic is not implemented properly. That leads to our second factor: implementation effectiveness.

Any chief can google a tactic that another police leader is championing. But will it be implemented in the same way? There is so much variance in operational implementation that it is impossible to predict success. I was a member of the recent National Academies of Sciences consensus panel on proactive policing that rated focused deterrence strategies like Operation Ceasefire as effective; but in Baltimore, poor implementation was blamed for its lack of success and ultimate demise.

Our panel also rated hot spots policing as effective. The police chief who ran the intervention evaluated by David Weisburd and colleagues was therefore no doubt disappointed that it didn’t affect anything. But with “aggressive order maintenance policing” implemented for only three hours a week, what did the department expect?

Because your department might adopt a strategy that is fundamentally flawed or is not appropriate for your environment, or you might not implement it effectively, nobody can say for certain whether it will succeed. And it is not just about money. If you have a weak department culture, investing in body-worn cameras will have limited value if officers don’t trust management and turn the cameras off.

No silver bullet, but we do have evidence-based policing

While we don’t have a silver bullet, an understanding of evidence-based policing can indicate the next best thing. EBP can maximize your chances of success.

To demonstrate, let’s put some ballpark probabilities on these ideas. Say you choose a strategy that has positive evaluations and is appropriate for your town. In other words, an effective strategy with high external validity. We could therefore score strategy validity on a scale of 0 to 1 and give this strategy a success probability of 0.8.

Then let’s say you are prepared to support the strategy with enough resources, targeted and concentrated in the right places, and for enough time (what is called dosage) to maximize your chances of success. Trouble is, we rarely know what counts as enough resources, but for argument’s sake, let’s ballpark this implementation effectiveness at 0.9.

Crime reduction success = Strategy validity x Implementation effectiveness

Your chance of crime reduction success is then 0.8 x 0.9 = 0.72. In other words, a 72% chance that your strategy will be effective.

But imagine your strategy has a low track record of success. Perhaps it only has a 20% chance of succeeding (strategy validity of 0.2). Even if you implement the hell out of it, you are still unlikely to reduce crime. If your implementation effectiveness is 0.95, you still only have a crime reduction success of 19% (0.2 x 0.95).

Equally, if you pick a highly regarded strategy (let’s give hot spots policing a strategy validity of 0.9), you rapidly lose effect if you don’t put the effort and resources behind it. If your implementation effectiveness is estimated at just 0.5, then your chances of crime reduction success drop to just 45%.
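
These scenarios can be sketched in a few lines of code. This is a minimal sketch of the back-of-the-envelope model above; the probabilities are the ballpark figures from the text, not measured values.

```python
def crime_reduction_success(strategy_validity, implementation_effectiveness):
    """Toy model from the text: both inputs are rough probabilities between 0 and 1."""
    return strategy_validity * implementation_effectiveness

# Effective, externally valid strategy, implemented well
print(round(crime_reduction_success(0.8, 0.9), 2))   # 0.72
# Weak strategy, implemented flawlessly
print(round(crime_reduction_success(0.2, 0.95), 2))  # 0.19
# Strong strategy (hot spots), half-hearted implementation
print(round(crime_reduction_success(0.9, 0.5), 2))   # 0.45
```

The takeaway from the multiplication is that a low score on either factor drags down the overall chance of success, no matter how high the other factor is.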

The ‘so what?’

When crime scientists say ‘it depends’ it isn’t because we enjoy sitting on the fence. It’s because we recognize that no tactic has a strategy validity of 100%. We also know that a city might invest so little in it that a good idea can fail because of weak implementation. In a recent op-ed, I lamented that my own city invests only $130,000 in focused deterrence (rated effective in evidence-based evaluations) from a violence prevention budget of $48m.

We are not yet at the stage where we can assign probabilities between 0 and 1 to strategies or implementations, though forest plots (for example) might be a starting point. However, the basic idea in this blog might help you understand how strategy validity and implementation effectiveness interact, and why it is so important to score highly on both.

Evidence-based policing will not give you the silver bullet. Nothing can, and you should be suspicious of anyone promising it. But evidence-based policing can evaluate a tactic’s track record of success and when it might be best deployed (strategy validity). It can also describe how it was operationalized successfully so you can maximize your implementation effectiveness.

It’s the closest thing to a silver bullet you will get, absent an actual werewolf problem.

Reducing crime increases job satisfaction

There are some lousy jobs in policing.

Back in my time in the Met police, being a custody sergeant was generally seen as the worst job in the nick, and I can imagine being a diver searching for dead bodies with your hands in the zero visibility of the Thames isn’t exactly a giggle either. There are, however, certain roles that can create some measure of job satisfaction. In my work with colleagues in Philadelphia, we found that many officers enjoyed foot patrol (when they volunteered for it) during the Philadelphia Policing Tactics Experiment, and community policing officers have reported positive job satisfaction with that role. Job satisfaction is important because it is a pretty reliable indicator of job performance.

Thanks to the work of Victoria Sytsma and Eric Piza, we might now be able to add problem-solving to the list. In a recent article in the journal Policing: A Journal of Policy and Practice, they report on a survey they gave to over 900 police officers in the Toronto Police Service (Ontario, Canada). While they only received 178 responses (come on, TPS), that was enough to get some understanding of response and community policing in Toronto. The vast majority of officers were either ‘somewhat’ or ‘very’ satisfied with their job, which is a credit to TPS, and maybe to Canadian cops in general.

It gets interesting when they look at the relationship between job satisfaction and the frequency of public interaction. 38% of officers who answer calls for service reported being very satisfied in their current job assignment, compared to 62% of those whose contact with the public was primarily for problem-solving. As they explain, “Engaging in problem-solving increases the odds of selecting a very positive ranking of job satisfaction over the combination of each of the three lower categories by 112% (OR = 2.12), controlling for frequent public interaction.” In other words, it isn’t how often you interact with the public that matters; it’s the nature of that interaction.
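
A note on the odds-ratio language, since it trips people up: an OR of 2.12 says the odds (not the probability) of reporting ‘very satisfied’ roughly double. As a sketch, here is the unadjusted ratio you would get from the two raw proportions reported above; the paper’s 2.12 is the covariate-adjusted estimate, so the numbers differ.

```python
def odds(p):
    """Convert a probability to odds, e.g. 0.62 -> 0.62/0.38."""
    return p / (1 - p)

# Unadjusted odds ratio from the two reported proportions
raw_or = odds(0.62) / odds(0.38)
print(round(raw_or, 2))        # 2.66 -- cruder than the adjusted 2.12

# An odds ratio expressed as a percentage increase in the odds
print(round((2.12 - 1) * 100))  # 112
```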
When I was a young cop, it was clear to me that response officers only ever encountered people who were stressed, either because they had recently been the victim of a crime (hence the police crime report) or because they had been stopped as potential suspects. Foot patrol and other community roles can bring officers into contact with people in more normal circumstances, often in very positive ways. As Evan Sorg and I discuss in our book Foot Patrol: Revisiting the Cornerstone of Policing, officers with more engagement with the ‘normal’ community in an area can have higher levels of job satisfaction.

There are some limitations to the research reported by Sytsma and Piza, not least the absence of pre-post measures. In other words, they didn’t survey groups of officers before and after their assignment to Toronto’s Community Response Unit and compare those results to a control group of officers. That said, the work suggests a positive relationship, and it opens up all sorts of ways to think about job satisfaction in non-response roles. Problem-solving and problem-oriented policing have repeatedly been shown to be effective tactics for combating both crime and disorder. We might now be able to add job satisfaction as a further benefit of problem-solving work.

(The full article is in press as Sytsma, V. A. & Piza, E. L. (2017). Quality over quantity: Assessing the impact of frequent public interaction compared to problem-solving activities on police officer job satisfaction. Policing: A Journal of Policy and Practice.)

Not all evidence is created equally

In the policy world, not all evidence is created equally.

I’m not talking about forensic or criminal evidence (though those areas have hierarchies too). I’m referring to the evidence we need to make a good policy choice. Policy decisions in policing include whether my police department should support a second responder program to prevent domestic abuse, or whether it should dedicate officers to teach D.A.R.E. in our local schools (the answer to both questions is no). The evidence I’m talking about here is not just ‘what works’ to reduce crime significantly, but also how it works (what mechanism is taking place), and when it works (in what contexts the tactic may be effective).

We can harvest ideas about what might work from a variety of sources. Telep and Lum found that a library database was the least accessed source for officers when learning about what tactics might work[1]. That might be because for most police officers, a deep dive into the academic literature is like being asked to do foot patrol in Hades while wearing a nylon ballistic vest. But it also means that the ‘what works’, ‘how works’, and ‘when works’ of crime reduction are never fully understood. Too many cops unfortunately rely on their intuition, opinion or unreliable sources, as noted by Ken Pease and Jason Roach:

Police officers may be persuaded by the experience of other officers, but seldom by academic research, however extensive and sophisticated. Collegiality among police officers is an enduring feature of police culture. Most officers are not aware of, are not taught about and choose not to seek out relevant academic research. When launching local initiatives, their first action tends to be the arrangement of visits to similar initiatives in other forces, rather than taking to the journals.[2]

In fact, not only do officers favor information from other cops, but it also has to be from the right police officers. Just about everyone in policing at some point has been told to forget what they learned at the police academy. That was certainly my experience. And those courses were usually taught by experienced police officers! It’s too easy to end up with a very narrow and restricted field of experience on which to draw.

Fortunately, while a basic understanding of different research qualities is helpful, you do not need an advanced degree in research methodology to implement evidence-based policing from a range of evidence sources. It’s sufficient to appreciate that there is a hierarchy of evidence in which to place your trust, and to have a rudimentary understanding of the differences between the levels. I’m not dismissing any form of research, but I am saying that some research is more reliable than other research and more useful for operational decision-making. For example, internal department research may not be great for choosing between prevention strategy options, but it is hugely useful for identifying the range of problems.

There are a number of hierarchies of evidence available. Criminologists are familiar with the Maryland Scientific Methods Scale devised by Larry Sherman and colleagues[3]. In this scale, studies are ranked from 1 (weak) to 5 (strong) based on the study’s capacity to show causality and limit the effects of selection bias. Selection bias occurs when researchers choose the participants in a study rather than allowing random selection. It’s not necessarily an intent to harm the study, but it is a common problem. When we select participants or places to be part of a study, we can unconsciously choose places that will perform better than randomly selected sites. We cherry-pick the subjects, bias the study, and get the result we want. It’s why so many supposedly great projects have been difficult to replicate.
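
To make the cherry-picking mechanism concrete, here is a toy simulation (all numbers are invented for illustration). The “program” below does nothing at all, yet hand-picking sites that were already trending in the right direction produces an impressive apparent crime drop; random assignment does not.

```python
import random

random.seed(1)

# 2,000 hypothetical sites, each with its own underlying crime trend.
# The "program" itself has zero effect in this simulation.
sites = []
for _ in range(2000):
    trend = random.gauss(0, 5)           # some sites improving, some worsening
    before = 100 + random.gauss(0, 10)   # incidents in the "before" period
    after = before + trend + random.gauss(0, 10)
    sites.append({"trend": trend, "before": before, "after": after})

def mean_change(group):
    """Average before-to-after change in incidents for a group of sites."""
    return sum(s["after"] - s["before"] for s in group) / len(group)

# Cherry-picking: place the program in the 400 sites already trending downward.
picked = sorted(sites, key=lambda s: s["trend"])[:400]
# Randomization: place it in 400 sites chosen at random.
randomized = random.sample(sites, 400)

print(round(mean_change(picked), 1))      # a sizeable "crime drop" -- pure selection bias
print(round(mean_change(randomized), 1))  # near zero -- the program's true effect
```

The randomized group recovers the true (null) effect; the cherry-picked group manufactures a success story out of nothing.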

But the Maryland scale only addresses academic research, and police officers get knowledge and information from a range of sources. Rob Briner’s work in evidence-based management also stresses this. Go into any police canteen or break room and you hear anecdote after anecdote (I wrote a few weeks ago about the challenges of ‘experience’). These examples of homespun wisdom—also known as carefully selected case studies—are often illustrative of an unusual case and not a story of the mundane and ordinary that we deal with every day.

This professional expertise can also be supplemented by the stakeholder concerns of the local community (including the public or the local enforcement community such as other police departments or prosecutors). Knowing what is important to stakeholders and innovative approaches adopted by colleagues is useful in the hunt for a solution to a crime or disorder problem.

Police also get information from experts and internal reports. In larger forces, reports from statistics or crime analysis units can be important sources of information. Organizational data are therefore useful to officers who try and replicate crime reduction that a colleague in a different district appears to have achieved. All of these sources are important to some police officers, and they deserve a place on the hierarchy. But because they are hand selected, can be abnormal, might be influenced internally, or have not been subjected to a lot of scrutiny, they get a lower place on the chart.

In the figure on this page, I’ve pulled together a hierarchy of evidence from a variety of sources and combined them in a way that lets you appreciate the different sources and their benefits and concerns. Hopefully this will also help you interpret the evidence from sources such as The National Institute of Justice’s website and The UK College of Policing’s Crime Reduction Toolkit. In a later post I’ll try to expand on each of the levels (from 0 to 5*).


  1. Telep, C.W. and C. Lum, The receptivity of officers to empirical research and evidence-based policing: An examination of survey data from three agencies. Police Quarterly, 2014. 17(4): p. 359-385.
  2. Pease, K. and J. Roach, How to morph experience into evidence, in Advances in Evidence-Based Policing, J. Knutsson and L. Tompson, Editors. 2017, Routledge. p. 84-97.
  3. Sherman, L.W., et al., Preventing Crime: What works, what doesn’t, what’s promising. 1998, National Institute of Justice: Washington DC.

It’s time for Compstat to change

If we are to promote more thoughtful and evidence-based policing, then Compstat has to change. The Compstat-type crime management meeting has its origins in Bill Bratton’s need to extract greater accountability from NYPD precinct commanders in late-1990s New York. It was definitely innovative for policing at the time, and it instigated many initiatives that are hugely beneficial to modern policing (such as the growth of crime mapping). It has also arguably been successful in promoting greater reflexivity from middle managers; however, these days the flaws are increasingly apparent.

Over my years of watching Compstat-type meetings in a number of departments, I’ve observed everyone settle into their Compstat role relatively comfortably. Well almost. The mid-level local area commander who has to field questions is often a little uneasy, but these days few careers are destroyed in Compstat. A little preparation, some confidence, and a handful of quick statistics or case details to bullshit through the tough parts will all see a shrewd commander escape unscathed.

In turn, the executives know their role. They stare intently at the map, ask about a crime hot spot or two, perhaps interrogate a little on a case just to check the commander has some specifics on hand, and then volunteer thoughts on a strategy the commander should try—just to demonstrate their experience. It’s an easy role because it doesn’t require any preparation. In turn, the area commander pledges to increase patrols in the neighborhood and everyone commits to reviewing progress next month, safe in the knowledge that little review will actually take place because by then new dots will have appeared on the map to absorb everyone’s attention. It’s a one-trick pony and everyone is comfortable with the trick.

There are some glaring problems with Compstat. The first is that the analysis is weak and often just based on a map of dots or, if the department is adventurous, crime hot spots. Unfortunately, a map of crime hot spots should be the start of an analysis, not the conclusion. It’s great for telling us what is going on, but this sort of map can’t really tell us why. We need more information and intelligence to get to why. And why is vital if we are to implement a successful crime reduction strategy.

We never get beyond this basic map because of the second problem: the frequent push to make an operational decision immediately. When command staff have to magic up a response on the spot, the result is often a superficial operational choice. Nobody wants to appear indecisive, but with crime control this can be disastrous. Too few commanders ever request more time for analysis, or time to consider the evidence base for their operational strategies. It’s as if asking to think more about a complex problem would be seen as weak or too ‘clever’. I concede that tackling an emerging crime spike quickly might be valuable (though spikes often regress to the mean, or as Sir Francis Galton called it in 1886, regression towards mediocrity). Many Compstat issues, however, revolve around chronic, long-term problems where a few days isn’t going to make much difference. We should adopt the attitude that it’s better to have a thoughtfully considered, successful strategy next week than a failing one this week.
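
Galton’s point can be demonstrated with another invented simulation: if monthly counts merely fluctuate around a stable beat average, the ‘worst’ beats this month will look better next month with no intervention at all.

```python
import random

random.seed(0)

# 1,000 hypothetical beats, each averaging 50 incidents a month;
# monthly counts just bounce randomly around that long-run average.
this_month = [50 + random.gauss(0, 10) for _ in range(1000)]
next_month = [50 + random.gauss(0, 10) for _ in range(1000)]

# "Respond" to the 100 worst beats this month.
spikes = sorted(range(1000), key=lambda i: this_month[i], reverse=True)[:100]

avg_now = sum(this_month[i] for i in spikes) / 100
avg_next = sum(next_month[i] for i in spikes) / 100
# avg_now sits well above 50; avg_next falls back toward 50 even though
# nothing was done -- regression to the mean, not a policing success.
```

Any saturation patrol deployed to those spiking beats would happily claim credit for that fall.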

Because of the pressure to miracle a working strategy out of thin air, area commanders usually default to a limited set of standard approaches, saturation patrol with uniform resources being the one that I see at least 90 percent of the time. And it’s applied to everything, regardless of whether there is any likelihood that it will impact the problem. It is suggested by executives and embraced by local area commanders because it is how we’ve always escaped from Compstat. Few question saturation patrols, there is some evidence it works in the short term, and it’s a non-threatening traditional policing approach that everyone understands. Saturation patrol is like a favorite winter coat, except that we like to wear it all year round.

Third, in the absence of a more thoughtful and evidence-based process, too many decisions and views lack any evidential support and instead are driven by personal views. There is a scene in the movie Moneyball where all the old baseball scouts are giving their thoughts on which players the team should buy, based only on the scouts’ experience, opinion and personal judgment. They ignore the nerd in the corner who has real data and figures … and some insight. They even question if he has to be in the room. In the movie, the data analyst is disparaged, even though he doesn’t bring an opinion or intuition to the table. He brings data analysis, and the data don’t care how long you have been in the business.

Too many Compstat meetings are reminiscent of this scene. The centerpiece of many Compstat meetings is a map of crime that many attendees are viewing for the first time. A room full of people wax lyrical on the crime problem based on their intuitive interpretation of a map on the wall, and then they promote solutions for our beleaguered commander, based too often on opinion and personal judgment and too little on knowledge of the evidence supporting the tactic’s effectiveness. Because everyone knows they have to come back in a month, the strategies are inevitably short-term in nature and never evaluated. And without being evaluated, they are never discredited, so they become the go-to tactical choice ad infinitum.

So the problems with Compstat are weak analysis, rushed decision-making, and opinion-driven strategies. What might the solutions be?

The U.K.’s National Intelligence Model is a good starting point for consideration. It has a strategic and a tactical cycle. The strategic meeting attendees determine the main strategic aims and goals for the district. At a recent meeting a senior commander told me “We are usually too busy putting out fires to care about who is throwing matches around.” Any process that has some strategic direction to focus the tactical day-to-day management of a district has the capacity to keep at least one eye on the match thrower. A monthly meeting, focused on chronic district problems, can generate two or three strategic priorities.

A more regular tactical meeting is then tasked with implementing these strategic priorities. This might be a weekly meeting that can both deal with the dramas of the day as well as supervise implementation of the goals set at the strategic meeting. It is important that the tactical meeting should spend some time on the implementation of the larger strategic goals. In this way, the strategic goals are not subsumed by day-to-day dramas that often comprise the tyranny of the moment. And the tactical meeting shouldn’t set strategic goals—that is the role of the strategic working group.

I’ve previously written that Compstat has become a game of “whack-a-mole” policing with no long-term value. Dots appear, and we move the troops to the dots to try and quell the problem. Next month new dots appear somewhere else, and we do the whole thing all over again. If we don’t retain a strategic eye on long-term goals, it’s not effective policing. It’s Groundhog Day policing.

What role for experience in evidence-based policing?

I recently received an illustrative lesson in the challenges of evidence-based policing. I was asked to sit in on a meeting where a number of senior managers were pitching an idea to their commander. It required the redistribution of patrols, and they were armed with evidence that the existing beats were not in the best locations and so were not as effective as they could be. The commander sat back in his chair and said “so I have to move some of those patrols?” Yes, the area managers responded, presenting a cogent yet measured response based on a thorough data analysis supported with current academic research. The commander replied “well in my experience, they are being effective so I am not going to move them”. And at that point the meeting ended. Experience trumped data and evidence, as it often does.

All the evidence available suggested that the commander made a poor decision. When I was learning to be a pilot I heard an old flying aphorism about decisions: good decisions come from experience, and experience comes from bad decisions[1]. Unfortunately, the profession of policing cannot lurch through an endless cycle of bad decisions as new recruits enter policing and learn the business. The modern intolerance for honest mistakes and anything-but-perfection precludes this. It’s also expensive and dangerous. How, then, do we develop and grow a culture of good decisions? And what is the role of experience?

In praise of experience

Policing is unique in the liberal level of discretion and absence of supervision given to its least experienced officers. As a teenager when I started patrol, I can testify to the steep learning curve on entering the job. Experiences come at you thick and fast, and in some ways we learn from them. Most cops absorb pretty quickly how to speak with people who are drunk or having a behavioral health crisis in a way that doesn’t end up with them rolling around in the street with fists flying. My colleagues demonstrated a style and tone, and I learned from the experience they had gained over time.

We should also recognize that the evidentiary foundation for much of policing is pretty thin. We simply do not yet know much about what works and what good practice looks like. It’s not as if we have an extensive knowledge bank with which to replace experience. In recognition of this, the UK College of Policing note that “Where there is little or no formal research, other evidence such as professional consensus and peer review, may be regarded as the ‘best available’”. So practitioner judgement may help fill a void until that time when we have more research across a wider variety of policing topics. In time, this research will help officers achieve better practice. In the meantime, shared experience may be of value, if (in the words of the UK College of Policing) “gathered and documented in a careful and transparent way”.

Finally, personal intuition and opinion may not be a sound basis on which to make policy, but sometimes they can offer insights in rarely studied areas. This can prompt new ways of looking at problems. By varying experience, we can learn new ways to deal with issues. These new ways could then be tested more formally. There is definitely a place for personal judgement in the craft of policing. But the current reliance on it prevents us from embracing a culture of curiosity and developing that evidence base[2]. And personal experience has other limitations.

A critique of experience

Unfortunately, like the commander at the start of this section, most police leaders don’t make decisions using the best evidence available. They overwhelmingly prefer decisions that are entrenched in their personal experience. The problem is that everyone’s experience is limited (we can’t have been everywhere and dealt with every type of incident), and in policing we receive too little feedback to actually learn many lessons.

What do I mean? As a young cop, I attended countless domestic disturbance calls armed with so little personal experience in long-term relationships it was laughable. It soon became clear that the measure of ‘success’ (against which ‘experience’ was judged) was whether we got a call back to that address during that shift. If we did, I had failed. If we didn’t, I had succeeded and was on my way to gaining the moniker of ‘experienced’.

But what if the husband beat his partner to within an inch of her life within hours of my going home? Or the next week? If our shift wasn’t on duty I would never learn that my mediation and resolution attempts had been unsuccessful or worse, harmful. I would never receive important feedback and would continue to deal with domestic disturbance calls in the same way. Absent supervision and feedback, not only would I continue to act in a harmful manner, worse, my colleagues and I might think I was now experienced. They might prioritize my attendance at these calls, and perhaps eventually give me a field training role. My bad practice would now become established ‘good’ practice.

As others have noted, “personal judgment alone is not a very reliable source of evidence because it is highly susceptible to systematic errors – cognitive and information-processing limits make us prone to biases that have negative effects on the quality of the decisions we make”. Even experienced police officers are not great at identifying crime hot spots[3] and do not do as well as a computer algorithm[4]. This isn’t just an issue for policing. Experts, often with many years of experience, are poor at making forecasts across a range of businesses and professions. The doctors who continued to engage in blood-letting into the latter half of the 19th century weren’t being callous. They probably had good intentions. But their well-meaning embrace of personal judgement, tradition and supposed best practice (probably learned from a medical guru) killed people.

I frequently conduct training on evidence-based and intelligence-led policing, and I often run a quick test. I show officers a range of crime prevention interventions and ask which are effective. It’s rare to find anyone who can get the correct answer, and most folk are wildly off target. It’s just for fun, but it illustrates how training and education in policing still remain at odds with a core activity: the reduction of crime.

A role for professional experience?

As Barends and colleagues note[5], “Different from intuition, opinion or belief, professional experience is accumulated over time through reflection on the outcomes of similar actions taken in similar situations.” It differs from personal experience because professional experience aggregates the knowledge of a variety of practitioners. It also emerges from explicit reflection on the outcomes of actions.

This explicit reflection requires feedback. When I was learning to fly, a jarring sensation and the sound of the instructor’s wince were the immediate feedback I needed to tell me I had not landed as smoothly as I had hoped. But flying around the traffic pattern, I immediately had another chance to prove improvement and a lesson learned. This type of immediate feedback and opportunity to improve is rare in policing. The radio has already dragged us to a different call.

For many enforcement applications, a research evaluation is essential to provide the kind of feedback that you can’t get from personal observation. The research on directed patrol for gun violence is a good example of how research evidence can improve strategy and increase public safety. Science and evaluation can replicate the experiences of hundreds of practitioners and pool that wisdom. While you can walk a single foot beat and think foot patrol is a waste of time, the aggregate experiences and data from 240 officers across 60 beats tell us otherwise.

Tapping into scientific research findings and available organizational data (such as crime hot spot maps and temporal charts) will enhance our professional experience. Being open to the possibility that our intuition and personal opinion may be flawed is also important, though difficult to accept. And developing a culture of curiosity that embraces trying new ways of tackling crime and disorder problems might be the most important of all. The starting point is to recognize that if personal experience remains the default decision-making tool, then we inhibit the development of better evidence. And we should realize that approach is harmful to communities and colleagues alike.

[1] This quote (sometimes replacing decisions with judgement) is attributed to various sources, but the most common is Mark Twain. It should also be pointed out that my fondness for checklists stems from one of the aviation industry’s attempts to reduce the poor decision-making learning spiral.

[2] I’m grateful to a smarter friend for pointing this out to me.

[3] Ratcliffe, J.H. and M.J. McCullagh, Chasing ghosts? Police perception of high crime areas. British Journal of Criminology, 2001. 41(2): p. 330-341.

[4] Weinborn, C., et al., Hotspots vs. harmspots: Shifting the focus from counts to harm in the criminology of place. Applied Geography, 2017. Online first.

[5] Barends, E., D.M. Rousseau, and R.B. Briner, Evidence-Based Management: The Basic Principles. 2014, Amsterdam: Center for Evidence-Based Management.