Year-to-date comparisons and why we should stop doing them
Year-to-date comparisons are common in both policing and the media. They compare the cumulative crime count for the current year up to a certain date with the count at the same point in the preceding year. For a Philadelphia example from April of this year, NBC reported that homicides were up 20 percent in 2017 compared to 2016. You can also find these types of comparison in the Compstat meetings of many police departments.
To gauge how reliable these mid-year estimates of doom-and-gloom are, I downloaded nine years (2007-2015) of monthly homicide counts from the Philadelphia Police Department. These are all open data available here. I calculated the overall annual change as well as the month-by-month cumulative change from year to year. In the table below you can see a row of annual totals in grey near the bottom, below which is the target prediction as a percentage of the previous year (white text, blue background). For example, the 332 homicides in 2008 were 14.7% lower than the previous year, expressed as a percentage of the 2007 total.
Let's say that we can tolerate a prediction that is within plus or minus 5 percentage points of the eventual difference between this year and the preceding year. That gives a fairly generous 10-point range, as indicated by the Low and High rows in blue.
For each month you can see the percentage difference between the year-to-date (YTD) count at the end of that month and the calendar YTD count for the same period in the previous year. So for example, at the end of January 2008 we had 21.9% fewer homicides than at the end of January 2007. By the time we get to December, we obviously have all the homicides for the year, so the December percentage change exactly matches the target percentage difference.
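The monthly YTD percentages described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `ytd_pct_change` and the monthly counts are invented for the example and are not the Philadelphia figures, and the code assumes the previous year's cumulative count is never zero.

```python
from itertools import accumulate


def ytd_pct_change(current, previous):
    """Percentage change in cumulative (YTD) counts, month by month.

    `current` and `previous` are lists of 12 monthly counts, January
    first. Assumes the previous year's cumulative count is never zero.
    """
    cum_current = list(accumulate(current))
    cum_previous = list(accumulate(previous))
    return [100.0 * (c - p) / p for c, p in zip(cum_current, cum_previous)]


# Hypothetical monthly counts, invented purely for illustration
previous_year = [30, 25, 28, 32, 35, 38, 40, 36, 30, 28, 26, 27]
this_year = [40, 28, 26, 30, 33, 36, 38, 35, 31, 27, 25, 28]

changes = ytd_pct_change(this_year, previous_year)
for month, change in enumerate(changes, start=1):
    print(f"Month {month:2d}: {change:+.1f}%")
```

In this made-up example January looks alarming while the December value, which is the true annual change, is close to zero.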
Cells highlighted with a green background have a difference from the previous year that is within our +/- 5 percent tolerance. By the end of each January, we only had one year (2012) with a percentage difference that was within 5 percent of how the city ended the year. The 57% increase in January 2011 was considerably different than the eventual 6% increase over 2010 at the end of December. When Philadelphia Magazine dramatically posted "Philly’s Murder Rate is Skyrocketing Again in 2014" on January 14th of that year, the month did indeed end up nearly 37 percent over 2013. But by year's end, the city had recorded just one homicide more than the preceding year - a less dramatic increase of 0.4%.
In fact, if we look for the month from which the difference falls within our 10% range and stays there through to the end of the year, then we have to wait until the months shown with a border. 2009 performed well. However, while 2010 was fairly accurate throughout the summer, the cumulative totals in September and October were more than 5% higher than the previous year, when the year ended only 0.3% higher.
To use calendar YTD comparisons with any confidence, we have to wait until the end of October before we can be more than 50% confident that the year-to-date is indicative of how we will enter the New Year. And even then we still have to be cautious. There was a chance at the end of November 2010 that we would end the year with fewer homicides, though the eventual count crept into increase territory.
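One way to read the bordered months and the 50% figure is sketched below. For each year, find the first month from which the YTD change stays within tolerance of the December value, then count the share of years that have reached that point by a given month. This is my assumption about how such a check could be coded rather than a record of the original spreadsheet. The function names and the two series of YTD changes are hypothetical.

```python
def first_stable_month(ytd_changes, tolerance=5.0):
    """First month (1-based) from which every YTD change stays within
    `tolerance` percentage points of the final (December) change."""
    target = ytd_changes[-1]
    stable_from = len(ytd_changes)  # December always matches itself
    # Walk backwards from November until a month falls outside the band
    for month in range(len(ytd_changes) - 1, 0, -1):
        if abs(ytd_changes[month - 1] - target) > tolerance:
            break
        stable_from = month
    return stable_from


def share_stable_by(month, years, tolerance=5.0):
    """Share of years whose YTD change is reliably in band by `month`."""
    stable = sum(1 for ytd in years if first_stable_month(ytd, tolerance) <= month)
    return stable / len(years)


# Hypothetical YTD percentage changes (Jan..Dec), invented for illustration
years = [
    [33.3, 4.0, 13.3, 7.8, 4.7, 2.7, 1.3, 0.8, 1.0, 0.6, 0.3, 0.5],
    [-20.0, -15.0, -12.0, -10.5, -9.0, -3.0, -8.0, -7.5, -11.0, -6.0, -5.5, -5.0],
]

for month in (4, 5, 10):
    print(month, share_stable_by(month, years))
```

Note that the first made-up year is briefly within tolerance in February but drifts out again, so it only counts as stable from May. That is the same effect as 2010 in the table.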
The bottom line is that with crimes such as homicide, we need not necessarily worry about crime panics at the beginning of the year. This isn't to say we should ever get complacent and of course every homicide is one too many; however the likely trend will only become clear by the autumn.
Alternatives exist. Moving averages seem to work okay, but another alternative I like is to compare a full (i.e. 12-month) YTD to the preceding full 12-month YTD. So instead of (for example) comparing January-April 2010 to January-April 2009, you could compare the 12-month total for May 2009-April 2010 against the May 2008-April 2009 total. I've done that in the red graph below. The first available point is December 2008 and, as we know from the previous table, those 12 months had 14.7% fewer homicides than calendar year 2007. From then on, each subsequent month measures not just the calendar YTD but the 12-month YTD.
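A rough sketch of that 12-month comparison follows, assuming the monthly counts sit in one chronological list. The function name `rolling_12_month_change` and the counts are made up for illustration. With data starting in January 2007, the first value returned corresponds to December 2008, matching the first point described above.

```python
def rolling_12_month_change(monthly_counts):
    """Percentage change of each trailing 12-month total against the
    12-month total ending one year earlier.

    `monthly_counts` is a chronological list of monthly counts. The first
    result belongs to the 24th month, the earliest point with two full
    12-month windows. Assumes no prior window sums to zero.
    """
    changes = []
    for end in range(24, len(monthly_counts) + 1):
        recent = sum(monthly_counts[end - 12:end])
        prior = sum(monthly_counts[end - 24:end - 12])
        changes.append(100.0 * (recent - prior) / prior)
    return changes


# Hypothetical data: three years of flat monthly counts at different levels
counts = [30] * 12 + [27] * 12 + [33] * 12

trend = rolling_12_month_change(counts)
print([round(value, 1) for value in trend])
```

Because every point uses a full year of data on each side, the series moves gradually from one annual figure to the next instead of lurching about in the early months.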
The result is a graph that shows the trend changing over time from negative (good) territory to positive (bad for homicides because it shows an increase). Not only do you get a more realistic comparison that is useful throughout the year, you can also see the changing trend. Anything below the horizontal axis is good news - you are doing well. Above it means that your most recent 12 months (measured at any point) were worse than the preceding 12 months.
You can have overlapping comparison periods. The graph in blue below compares 24 months of accumulated counts with the 24-month total ending a year earlier. For example, the first point available is December 2009. This -11.7% value represents the change in total homicides between the 24 months from January 2008 to December 2009 and the 24 months ending a year earlier (January 2007 to December 2008). For comparison purposes, I have retained the same vertical scale but note the change in horizontal axis.
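The overlapping version only needs the window length and the lag between the two windows to be set separately. The sketch below is again an assumption-laden illustration: `windowed_change`, its `window` and `lag` parameters, and the counts are all hypothetical. With a 24-month window, a 12-month lag and data starting in January 2007, the first value corresponds to December 2009.

```python
def windowed_change(monthly_counts, window=24, lag=12):
    """Percentage change of each trailing `window`-month total against the
    `window`-month total that ended `lag` months earlier.

    When `window` is longer than `lag`, the two periods overlap.
    Assumes no earlier window sums to zero.
    """
    changes = []
    for end in range(window + lag, len(monthly_counts) + 1):
        recent = sum(monthly_counts[end - window:end])
        prior = sum(monthly_counts[end - lag - window:end - lag])
        changes.append(100.0 * (recent - prior) / prior)
    return changes


# Hypothetical data: four years of flat monthly counts at different levels
counts = [30] * 12 + [27] * 12 + [33] * 12 + [33] * 12

smoothed = windowed_change(counts, window=24, lag=12)
print([round(value, 1) for value in smoothed])
```

Setting `window=12` gives the non-overlapping 12-month comparison, so the same function lets you try different degrees of smoothing.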
You can see there is more smoothing, but the general trend over time is still visible. Lots of variations are available, and you might want to play with different options for your crime type and crime volume.