Excerpted From: Ben Grunwald and Jeffrey Fagan, The End of Intuition-based High-crime Areas, 107 California Law Review 345 (April 2019) (165 Footnotes) (Full Document)


Every year, police officers stop and frisk millions of pedestrians on the street. One of the most common justifications they cite for these stops is that the suspect was located in a “high-crime area” (HCA). In 2000, the Supreme Court gave formal approval to that practice in Illinois v. Wardlow, by holding that a suspect's presence in a high-crime area is relevant in determining whether an officer has reasonable suspicion to conduct a stop. In doing so, the Court sanctioned a dramatic expansion in police discretion that has impacted “almost every” case challenging the constitutionality of a stop.

Despite the importance of the decision, the Court provided remarkably little guidance on how to interpret and implement the high-crime area standard in practice. Indeed, the opinion said nothing at all about what “high-crime area” means, and the lower courts have made little progress filling the gap. As a result, officers haven't been told how to apply the high-crime area standard--how to think about its proper geographic scope, its relevant temporal horizon, or the kinds of crimes that are most relevant. The Court also said nothing about the relevant evidentiary standards for establishing that an area is high crime. In response, the lower courts have been remarkably lax in scrutinizing officers' claims about high-crime areas. The most common approach is to defer to the expertise of the police officer, often adopting his bare testimony that an area is “high crime” without additional proof.

In the absence of a legal definition of high-crime areas and of meaningful judicial scrutiny, police officers enjoy wide discretion to define high-crime areas however they want. The wisdom of Wardlow as a constitutional doctrine thus depends heavily on how police officers exercise their discretion while implementing it in practice.

We argue that Wardlow depends on at least three unspoken empirical assumptions. The first assumption concerns the geographic scope of a high-crime area. The few lower courts that have confronted this question have generally agreed that high-crime areas should be analyzed through a granular geographic lens--more like a street block or intersection than a neighborhood or city.

The second assumption is that officers' assessments of high-crime areas are relatively accurate. There are some good reasons to question that assumption. For one thing, officers may not always be aware of actual crime rates, which can fluctuate over time. Their assessments might also be skewed by bureaucratic pressures to increase the number of stops they conduct even if they lack constitutional justification. And officers' assessments of high-crime areas might also be influenced by racial and socioeconomic biases based on the characteristics of suspects and neighborhoods in which their stops take place.

The third empirical assumption concerns predictive power. Like any other Fourth Amendment factor, a suspect's presence in a high-crime area only supports reasonable suspicion if that fact predicts, on average, whether a suspect is engaged in crime. Wardlow thus assumes that, controlling for other stated bases of reasonable suspicion, there is a higher probability that a suspect is engaged in a crime where the officer invokes high-crime area as a basis of a stop.

Nearly two decades have passed since the Supreme Court issued Wardlow, and yet we have almost no evidence about how police officers apply the high-crime area standard. We therefore don't know whether any of these empirical assumptions are satisfied in practice.

Our goal in this Article is to evaluate Wardlow by testing its empirical assumptions directly. To do so, we use a dataset of over two million police stops conducted by the New York Police Department (NYPD) between 2007 and 2012. The data derive from forms that officers are required to complete after every stop. The forms collect rich information on suspect demographics and the precise geographic location of each stop. The data also contain anonymized officer identifiers, which allow us to observe how the same officer behaves in different areas, and how different officers behave in the same areas. And, most important for our purposes, the forms require officers to check off a series of roughly twenty boxes, indicating the bases of suspicion that justified the stop. Fortunately, one of those boxes is for high-crime areas. We merged this dataset with crime statistics and racial and demographic information on small geographic areas in New York City.

Of course, we need to be careful about how we interpret our stop-form dataset. One possibility is that it tells us about the ex ante, subjective mental state of a police officer--that is, it describes the reasons an officer believed a stop was lawful moments before he carried it out. To a limited extent, we hope we can learn something about that internal mental process, but the data face significant limitations to serve that purpose. Indeed, officers fill out the form after completing their stops and they may therefore engage in post-hoc rationalization. Still, our data may offer a very rough proxy of what officers were thinking in the moment--a proxy that's better than anything else currently available.

Perhaps a more fitting interpretation of the stop-form data is that they describe the ex post, objective factors a police officer would use to justify a stop if he were ever asked to do so in court. This objective perspective is particularly important in the Fourth Amendment context where, under Whren v. United States, the officer's subjective mental state is irrelevant in assessing whether a stop is unconstitutional. The data are well suited to illuminate that objective perspective. For one thing, just a few years before our data begin, the check boxes on the stop forms were created as a result of a lawsuit against the NYPD to require the department to document the bases of suspicion in every stop. For another, when a stop is challenged at a suppression hearing, officers are incentivized to give testimony consistent with the contents of their stop form. Indeed, the form is typically discoverable, which means the defense can impeach an officer whose testimony deviates from it. For these reasons, our data appear well suited for examining the objective factors--including the high-crime area factor--that an officer would raise to justify each stop.

Turning to our results, our empirical analyses provide significant evidence that none of Wardlow's empirical assumptions are satisfied in practice. With respect to the first, our regression models suggest that officers often assess whether an area is high crime through a broad geographic lens. In many of our models, police precinct-level measures of crime (on average, four square miles) are substantially stronger predictors of whether an officer invokes HCA than measures of crime at a smaller level of geography, the census block group (0.05 square miles). That's particularly true for violent- and property-crime stops. This suggests that officers frequently apply the high-crime area standard to large geographic areas such as police precincts.

Even more important, our results also provide little support for Wardlow's second assumption. Officers invoke HCA in 57 percent of all stops--more often than any other basis of reasonable suspicion. And, while officers invoke HCA more often in certain parts of the city than others, they frequently do so everywhere. Indeed, in 98 percent of census block groups, officers invoked HCA in at least 30 percent of stops. In other words, officers are claiming that every block in New York City is high crime at one time or another. That claim seems implausible--particularly in the “safest big city in America.”
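To make the block-group figure concrete, the following is a minimal sketch of how such a share could be tabulated from stop records. It is not the authors' code. The record layout (an area identifier paired with a flag for whether the HCA box was checked), the function names, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def hca_rate_by_area(stops):
    """Return {area_id: share of that area's stops in which HCA was checked}.

    `stops` is an iterable of (area_id, hca_checked) pairs. This layout is
    hypothetical; it is not the format of the NYPD stop forms.
    """
    totals = defaultdict(int)
    invoked = defaultdict(int)
    for area_id, hca_checked in stops:
        totals[area_id] += 1
        if hca_checked:
            invoked[area_id] += 1
    return {area: invoked[area] / totals[area] for area in totals}


def share_of_areas_at_or_above(rates, threshold):
    """Share of areas whose HCA invocation rate is at least `threshold`."""
    if not rates:
        raise ValueError("no areas to summarize")
    qualifying = sum(1 for rate in rates.values() if rate >= threshold)
    return qualifying / len(rates)


# Invented toy records for three hypothetical block groups.
toy_stops = [
    ("bg_A", True), ("bg_A", True), ("bg_A", False),
    ("bg_B", True), ("bg_B", False), ("bg_B", False), ("bg_B", False),
    ("bg_C", False), ("bg_C", True),
]

rates = hca_rate_by_area(toy_stops)
share = share_of_areas_at_or_above(rates, 0.30)
```

In the toy data, two of the three hypothetical block groups meet the 30 percent threshold. Applied to real stop records, the same tabulation would yield the kind of citywide share reported above.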

More to the point, officers' assessments of whether areas are high crime appear inaccurate. Despite our best efforts to predict HCA based on measures of crime at different levels of geography, temporal horizons, and crime types, our most predictive models produced an R² of just 0.01. In other words, actual crime rates predicted only one percent of the variation in officers' assessments of whether areas are high crime.
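As a rough illustration of what "share of variation explained" means, the sketch below fits a one-predictor least-squares line and computes the resulting coefficient of determination. It is a simplification under stated assumptions: the authors' models include many predictors, and the function name and toy numbers here are invented rather than drawn from their data.

```python
def r_squared(x, y):
    """Coefficient of determination for a one-predictor least-squares fit.

    x: predictor values (for example, a local crime rate)
    y: outcomes (for example, 1 if HCA was checked on a stop, else 0)
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation in y left unexplained by the line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_res / syy


# Invented toy values: the predictor tracks the outcome only loosely.
toy_crime = [1, 2, 3, 4]
toy_hca = [0, 1, 0, 1]
toy_r2 = r_squared(toy_crime, toy_hca)
```

A value near 1 would mean the predictor accounts for almost all of the variation in the outcome; a value near 0 means it accounts for almost none.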

If actual crime rates don't explain whether an officer invokes HCA as a basis of a stop, what does? One partial answer is that racial and socioeconomic biases may influence officers' determinations. When we analyze all stops together and control for local crime conditions, we find that a given officer in a given area is more likely to invoke HCA against young, Black, male suspects. When we break the data up by the type of suspected crime, we find that the higher invocation rates against Blacks are concentrated among stops for violent crime. We also find evidence that, in assessing whether an area is high crime, officers rely on neighborhood proxies, such as the racial and socioeconomic composition of residents. For example, when we analyze all stops together, across all of our models, moving a stop from an area with virtually no Black residents to an area with 100 percent Black residents is associated with a larger increase in the probability that an officer invokes HCA than moving from the single safest area in the city to the single most dangerous. This pattern appears to be concentrated in stops where the suspected crime is a violent, drug, or weapons offense.

Inter-officer disparities might also help explain when officers invoke HCA. Controlling for area of the city, roughly a quarter of officers invoke HCA in just 25 percent of stops, while another 40 percent do so over 75 percent of the time.

These results raise strong doubts as to whether the invocation of HCA has any predictive power about whether a suspect is engaged in crime. If not, the third empirical assumption of Wardlow does not hold. We examine this question by measuring the correlation between whether an officer invokes HCA as the basis of a stop and whether that stop results in a recorded “hit”--an arrest, the recovery of a weapon, or the recovery of other contraband. Our analysis here is necessarily limited because we can only observe the suspects that were stopped; we cannot observe suspects that officers chose not to stop (perhaps because they lacked reasonable suspicion). Still, our results are informative even if they are censored. For two of our three “hit” variables--arrest and recovery of a weapon--when we control for other observable bases of suspicion, we find that the probability of an arrest or the recovery of a weapon decreases when an officer invokes HCA to justify the stop. In other words, when an officer invokes HCA, the suspect is less likely to be engaged in a crime. This suggests that HCA may not be an indicator of guilt at all. It further suggests that officers may invoke HCA to manufacture the appearance of reasonable suspicion in their weakest stops. For our third hit variable--whether the officer recovered any contraband other than a weapon--we find that the probability of a hit remains the same when the officer invokes HCA.
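The comparison described above can be illustrated with a simple stratified tabulation. This is a sketch, not the authors' method: they control for other bases of suspicion with regression models, whereas the code below merely groups stops by their other stated basis and compares hit rates with and without HCA inside each group. The record layout, the function name, the "fits_description" label, and the toy data are hypothetical.

```python
from collections import defaultdict


def hit_rates_by_stratum(stops):
    """Hit rate with and without HCA, within each group of other bases.

    `stops` is an iterable of (other_basis, hca_checked, hit) triples, where
    `other_basis` labels the non-HCA basis of suspicion recorded for the stop.
    Returns {other_basis: {True: rate_with_hca, False: rate_without_hca}};
    a rate is None when a cell contains no stops.
    """
    # counts[basis][hca_flag] holds [number of hits, number of stops]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for other_basis, hca_checked, hit in stops:
        cell = counts[other_basis][bool(hca_checked)]
        cell[1] += 1
        if hit:
            cell[0] += 1
    return {
        basis: {
            flag: (hits / total if total else None)
            for flag, (hits, total) in cells.items()
        }
        for basis, cells in counts.items()
    }


# Invented toy records; the basis labels are illustrative only.
toy_stops = [
    ("furtive_movement", True, False),
    ("furtive_movement", True, False),
    ("furtive_movement", True, True),
    ("furtive_movement", False, True),
    ("furtive_movement", False, False),
    ("fits_description", True, False),
    ("fits_description", False, True),
]

toy_rates = hit_rates_by_stratum(toy_stops)
```

In the toy data, stops that cite HCA have a lower hit rate than stops that do not within both groups, which is the direction of the pattern reported above for arrests and weapons. Like the authors' analysis, a tabulation of this kind observes only stops that actually occurred.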

Taken together, our findings provide empirical evidence that Wardlow may have been wrongly decided. Indeed, implementation of the high-crime area standard appears haphazard at best and discriminatory at worst. Officers call nearly every block in the city high crime at one time or another. Their assessments of high-crime areas are only weakly correlated with actual crime rates. The suspect's race predicts whether an officer deems an area high crime as well as the actual crime rate itself. The racial composition of the area and the identity of the officer are stronger predictors of whether an officer deems an area high crime than the crime rate. And officers may even be using high-crime area as cover to bolster the appearance of constitutional validity in their weakest stops. These findings raise important questions about whether police officers can responsibly wield the discretion granted to them under Wardlow.

Of course, in this Article, we only evaluate the implementation of the high-crime area standard by one department during one time period. Officers in other departments may be applying it with greater fidelity, and we cannot rule out this possibility with our data. But we ourselves are somewhat doubtful as the NYPD is one of the most organized, centralized, data-driven, and well-funded police departments in the country.

Short of reversing Wardlow, the courts have tools at their disposal to address some of the problems we have uncovered with the doctrine. Perhaps most simply, they could demand more rigorous data in suppression hearings to support an officer's claim that an area is high crime. We suspect this solution would not go far enough, however, because it would only address the tiny fraction of stops that result in a criminal charge and motion to suppress. Courts could go further by developing more precise definitions about the geographic scope, temporal horizon, and kinds of crimes relevant in assessing whether an area is high crime. A more aggressive judicial approach might prohibit a department from using high-crime areas to justify stops if there is evidence its officers are systematically misapplying the standard. Police departments that do not faithfully implement the standard should not be able to use it to justify their stops.

We recognize that these proposals depart, at least to some extent, from how courts treat other factors under the reasonable suspicion analysis. Applying those other factors typically involves a highly discretionary, fact-bound inquiry based on the totality of circumstances and the common-sense judgments of the police officer. But, perhaps, the reason that the Fourth Amendment has taken this shape over time is that there were no other options. Historically, courts lacked access to the data needed to validate how police officers invoke Fourth Amendment factors in the field. Indeed, as the Supreme Court explained in Wardlow itself:

In reviewing the propriety of an officer's conduct, courts do not have available empirical studies dealing with inferences drawn from suspicious behavior, and we cannot reasonably demand scientific certainty from judges or law enforcement officers where none exists. Thus, the determination of reasonable suspicion must be based on commonsense judgments and inferences about human behavior.

As we try to show in this Article, that moment may be coming to an end.

Courts are not the only institutions that should reconsider how they handle the high-crime area standard. Police departments can promulgate regulations to guide officers. Technological innovation can also help. The Philadelphia Police Department recently gave patrol officers smart phones with information on crimes occurring in the surrounding area. Such devices could be used to inform officers in real time about objective crime data so that they do not need to rely on their own subjective and potentially unreliable intuitions about local crime rates. These devices could limit discretion even further by simply informing officers whether they are, at any given moment, in a high-crime area based on crime data and departmental policy.

In addition to these specific proposals, the implications of our analysis extend further--beyond the high-crime area standard--in at least two ways. First, our analysis offers a more general lesson about the response of police to different forms of judicial regulation. For example, it's perhaps unsurprising that, once courts recognized “furtive movement” as a cognizable factor in the reasonable suspicion analysis, police began to see furtive movements everywhere. That concept is so vague, slippery, and contentless that any behavior might qualify. But, in principle, the concept of a high-crime area could be operationalized in a manner that is more objective and verifiable. And yet, police officers appear able to misuse that more regulable standard too. The story of Wardlow thus teaches that, for the Fourth Amendment to impose a meaningful constraint on police discretion, the courts may need to develop more specific standards about what reasonable suspicion factors mean or, alternatively, to require that police do so through internal regulations. Leaving the definition of those factors up to line officers on the street appears to be a dangerous proposition.

Second, our findings open the door to a largely uncharted area of empirical legal scholarship on the Fourth Amendment. Indeed, officers rely on countless factors other than high-crime areas in justifying the millions of stops they conduct each year. Officers may be applying some of those factors unfaithfully as well. Our analysis is therefore just the first step. We suggest that empirical legal scholars should begin validating other bases of reasonable suspicion on which officers regularly rely. Below, we identify several methodologies that could substantially advance this research agenda.

The remainder of this paper proceeds as follows. In Part I, we briefly describe the historical development of the investigative stop and its current use in policing practice today. In Part II, we layer on the high-crime area standard, discussing Wardlow and how the lower courts have applied the doctrine. We also describe and justify the empirical assumptions of the high-crime area standard. Part III describes our data, and Part IV details the results of our empirical analysis. In Part V, we explore the implications of our findings for courts, police, and the Fourth Amendment more generally.

[. . .] 

Assistant Professor of Law, Duke University School of Law.

Isidor and Seville Sulzbacher Professor of Law, Professor of Epidemiology, Columbia University.