The Statistical Analysis

Talking with African-Americans leaves little doubt that pretextual traffic stops have a profound impact on each individual stopped, and on all blacks collectively. There is also no doubt that blacks view this not as a series of isolated incidents and anecdotes, but as a long-standing pattern of law enforcement. For those subjected to these practices, pretextual stops are nothing less than blatant racial discrimination in the enforcement of the criminal law.

But is there proof that would substantiate those strongly held beliefs? What statistics exist that would allow one to conclude, to an acceptable degree of certainty, that "driving while black" is, indeed, more than just the sum of many individual stories?

Data on this problem are not easy to come by. This is, in part, because the problem has only recently been recognized beyond the black community. It may also be because records concerning police conduct are either irregular or nonexistent. But it may also be because there is active hostility in the law enforcement community to the idea of keeping comprehensive records of traffic stops. In 1997, Representative John Conyers of Michigan introduced H.R. 118, the Traffic Stops Statistics Act, which would require the Department of Justice to collect and analyze data on all traffic stops around the country--including the race of the driver, whether a search took place, and the legal justification for the search. When the bill passed the House with unanimous, bipartisan support, the National Association of Police Organizations (NAPO), an umbrella group representing more than 4,000 police interest groups across the country, announced its strong opposition to the bill. Officers would "resent" having to collect the data, a spokesman for the group said. Moreover, there is "no pressing need or justification" for collecting the data. In other words, there is no problem, so there is no need to collect data. NAPO's opposition was enough to kill the bill in the Senate in the 105th Congress. As a consequence, there is now no requirement at the federal level that law enforcement agencies collect data on traffic stops that include race. Thus, all of the data gathering so far has been the result of statistical inquiry in lawsuits or independent academic research.

A. New Jersey

The most rigorous statistical analysis of the racial distribution of traffic stops was performed in New Jersey by Dr. John Lamberth of Temple University. In the late 1980s and early 1990s, African-Americans often complained that police stopped them on the New Jersey Turnpike more frequently than their numbers on that road would have predicted. Similarly, public defenders in the area had observed that "a strikingly high proportion of cases arising from stops and searches on the New Jersey Turnpike involve black persons." In 1994, the problem was brought to the state court's attention in State v. Pedro Soto, in which the defendant alleged that he had been stopped because of his ethnicity. The defendant sought to have the evidence gathered as a result of the stop suppressed as the fruit of an illegal seizure. Lamberth served as a defense expert in the case. His report is a virtual tutorial on how to apply statistical analysis to this type of problem.

The goal of Lamberth's study was "to determine if the State Police stop, investigate, and arrest black travelers at rates significantly disproportionate to the percentage of blacks in the traveling population, so as to suggest the existence of an official or de facto policy of targeting blacks for investigation and arrest." To do this, Lamberth designed a research methodology to determine two things: first, the rate at which blacks were being stopped, ticketed, and/or arrested on the relevant part of the highway, and second, the percentage of blacks among travelers on that same stretch of road.

To gather data concerning the rate at which blacks were stopped, ticketed and arrested, Lamberth reviewed and reconstructed three types of information received in discovery from the state: reports of all arrests that resulted from stops on the turnpike from April of 1988 through May of 1991, patrol activity logs from randomly selected days from 1988 through 1991, and police radio logs from randomly selected days from 1988 through 1991. Many of these records identified the race of the driver or passenger.

Then Lamberth sought to measure the racial composition of the traveling public on the road. He did this through a turnpike population census--direct observation by teams of research assistants who counted the cars on the road and tabulated whether the driver or another occupant appeared black. During these observations, teams of observers sat at the side of the road for randomly selected periods of 75 minutes from 8:00 a.m. to 8:00 p.m. To ensure further precision, Lamberth also designed another census procedure--a turnpike violator census. This was a rolling survey by teams of observers in cars moving in traffic on the highway, with the cruise control calibrated and set at five miles per hour above the speed limit. The teams observed each car that they passed or that passed them, noted the race of the driver, and also noted whether or not the driver was exceeding the speed limit.

The teams recorded data on more than forty-two thousand cars. With these observations, Lamberth was able to compare the percentages of African-American drivers who were stopped, ticketed, and arrested with their relative presence on the road. This data enabled him to carefully and rigorously test whether blacks were in fact being disproportionately targeted for stops.

By any standard, the results of Lamberth's analysis are startling. First, the turnpike violator census, in which observers in moving cars recorded the races and speeds of the cars around them, showed that blacks and whites violated the traffic laws at almost exactly the same rate; there was no statistically significant difference in the way they drove. Thus, driving behavior alone could not explain differences in how police might treat black and white drivers. With regard to arrests, 73.2% of those stopped and arrested were black, while only 13.5% of the cars on the road had a black driver or passenger. Lamberth notes that the disparity between these two numbers "is statistically vast." The number of standard deviations present--54.27--means that the probability that the racial disparity is a random result "is infinitesimally small." Radio and patrol logs yielded similar results. Blacks are approximately 35% of those stopped, though they are only 13.5% of those on the road--19.45 standard deviations. Considering all stops in all three types of records surveyed, the chance that 34.9% of the cars combined would have black drivers or occupants "is substantially less than one in one billion." This led Lamberth to the following conclusion:

Absent some other explanation for the dramatically disproportionate number of stops of blacks, it would appear that the race of the occupants and/or drivers of the cars is a decisive factor or a factor with great explanatory power. I can say to a reasonable degree of statistical probability that the disparity outlined here is strongly consistent with the existence of a discriminatory policy, official or de facto, of targeting blacks for stop and investigation. . . . Put bluntly, the statistics demonstrate that in a population of blacks and whites which is (legally) virtually universally subject to police stop for traffic law violation, (cf. the turnpike violator census), blacks in general are several times more likely to be stopped than non-blacks.
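The standard-deviation figures reported in this section express how far an observed share of stops lies from the share a benchmark would predict. A minimal sketch of that kind of calculation follows, assuming a simple binomial model in which each stop is an independent draw from the traveling population. The function name and the sample size are hypothetical. The sample sizes behind Lamberth's figures are not given in this discussion, so the sketch does not attempt to reproduce the 54.27 or 19.45 results.

```python
import math


def standard_deviations_from_expected(observed_count, n, benchmark_share):
    """Return how many binomial standard deviations separate the observed
    count from the count expected if stops mirrored the benchmark share."""
    if n <= 0 or not 0.0 < benchmark_share < 1.0:
        raise ValueError("need n > 0 and a benchmark share strictly between 0 and 1")
    expected = n * benchmark_share
    std_dev = math.sqrt(n * benchmark_share * (1.0 - benchmark_share))
    return (observed_count - expected) / std_dev


# Hypothetical illustration only: 500 stops is an assumed sample size,
# not a number taken from the report. The shares are the ones quoted
# in the text (35% of stops against 13.5% of cars on the road).
n_stops = 500
z = standard_deviations_from_expected(0.35 * n_stops, n_stops, 0.135)
print(f"{z:.1f} standard deviations above the benchmark")
```

Because the standard deviation grows only with the square root of the sample size, the same gap between two percentages yields a larger number of standard deviations as more stops are examined.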

B. Maryland

A short time after completing his analysis of the New Jersey data, Lamberth also conducted a study of traffic stops by the Maryland State Police on Interstate 95 between Baltimore and the Delaware border. In 1993, an African-American Harvard Law School graduate named Robert Wilkins filed a federal lawsuit against the Maryland State Police. Wilkins alleged that the police stopped him as he was driving with his family, questioned them and searched the car with a drug-sniffing dog because of their race. When a State Police memo surfaced during discovery instructing troopers to look for drug couriers who were described as "predominantly black males and black females," the State Police settled with Wilkins. As part of the settlement, the police agreed to give the court data on every stop followed by a search conducted with the driver's consent or with a dog for three years. The data also were to include the race of the driver.

With this data, Lamberth used a rolling survey, similar to the one in New Jersey, to determine the racial breakdown of the driving population. Lamberth's assistants observed almost 6,000 cars over approximately 42 randomly distributed hours. As he had in New Jersey, Lamberth concluded that blacks and whites drove no differently; the percentages of blacks and whites violating the traffic code were virtually indistinguishable. More importantly, Lamberth's analysis found that although 17.5% of the population violating the traffic code on the road he studied was black, more than 72% of those stopped and searched were black. In more than 80% of the cases, the person stopped and searched was a member of some racial minority. The disparity between the 17.5% of violators who were black and the more than 72% of those stopped and searched who were black amounts to 34.6 standard deviations. Such statistical significance, Lamberth said, "is literally off the charts." Even while exhibiting appropriate caution, Lamberth came to a devastating conclusion:

While no one can know the motivation of each individual trooper in conducting a traffic stop, the statistics presented herein, representing a broad and detailed sample of highly appropriate data, show without question a racially discriminatory impact on blacks . . . from state police behavior along I-95. The disparities are sufficiently great that taken as a whole, they are consistent and strongly support the assertion that the state police targeted the community of black motorists for stop, detention, and investigation within the Interstate 95 corridor.

C. Ohio

In the Spring of 1998, several members of the Ohio General Assembly began to consider whether to propose legislation that would require police departments to collect data on traffic stops. But in order to sponsor such a bill, the legislators wanted some preliminary statistical evidence--a prima facie case, one could say--of the existence of the problem. This would help them persuade their colleagues to support the effort, they said. I was asked to gather this preliminary evidence. The methodology used here presents a case study in how to analyze this type of problem when the best type of data to do so is not available.

In the most fundamental ways, the task was the same as Lamberth's had been in both New Jersey and Maryland: use statistics to test whether blacks in Ohio were being stopped in numbers disproportionate to their presence in the driving population. Doing this would require data on stops broken down by race, and a comparison of those numbers to the percentage of black drivers on the roads. But if the goal was the same, two circumstances made the task considerably more difficult to accomplish in Ohio. First, Ohio does not collect statewide data on traffic stops that can be correlated with race. In fact, no police department of any sizeable city in the state keeps any data on all of its traffic stops that could be broken down by race. Second, the state legislators wanted some preliminary statistics to demonstrate that "driving while black" was a problem in all of Ohio, or at least in some significant--and different--parts of the whole state. While Lamberth's stationary and rolling survey methods worked well to ascertain the driving populations of particular stretches of individual, limited access highways, those methods were obviously resource- and labor-intensive. Applying the same methods to an entire city--even a medium-sized one--would entail duplicating the Lamberth approach on many major roads to get a complete picture. It would be impractical, not to mention prohibitively expensive, to do this in communities across an entire state. Thus, different methods had to be found.

To determine the percentage of blacks stopped, data was obtained from municipal courts in four Ohio cities. Municipal courts in Ohio handle all low-level criminal cases and virtually all of the traffic citations issued in the state. Most of these courts also generate a computer file for each case, which includes the race of the defendant as part of a physical description. This data provided the basis for a breakdown of all tickets given by the race of the driver.

The downside of using the municipal court data is that it only includes stops in which citations were given. Stops resulting in no action or a warning are not included. In all likelihood, using tickets alone might underestimate any racial bias that is present because police might not ticket blacks stopped for nontraffic purposes. Since using tickets could underestimate any possible racial bias, any resulting calculations are conservative and tend to give law enforcement the benefit of the doubt. Similarly, the way the racial statistics are grouped in the analysis is also conservative because the numbers are limited to only two categories of drivers: black and nonblack. In other words, all minorities other than African-Americans are lumped together with whites, even though some of these other minorities, notably Hispanics, have also complained about targeted stops directed at them. Using conservative assumptions means that if a bias does show up in the analysis, we can be relatively confident that it actually exists.

The percentages of all tickets in 1996, 1997, and the first four months of 1998 that were issued to blacks by the Akron, Dayton, and Toledo Police Departments and all of the police departments in Franklin County are set out in Table 1.

With ticketing percentages used as a measure of stops, attention turns to the other number needed for the analysis: the presence of blacks in the driving population. Given the concerns about the use of Lamberth's method in a statewide, preliminary study, another approach--a less exact one than direct observation, to be sure, but one that would yield a reasonable estimate of the driving population--was devised. Data from the U.S. Census breaks down the populations of states, counties, and individual cities by race and by age. This data is readily available and easy to use. Using this data, a reasonable basis for comparing ticketing percentages can be constructed: blacks versus nonblacks in the driving age population. This was done by breaking down the general population by race and by age. By selecting a lower and upper age limit--fifteen and seventy-five, respectively--for driving age, the data yield a reasonable reflection of what we would expect to find if we surveyed the roads themselves. The data on driving age population can also be sharpened by using information from the Nationwide Personal Transportation Survey, a study done every five years by the Federal Highway Administration of the U.S. Department of Transportation. The 1990 survey indicates that 21% of black households do not own a vehicle. If the driving age population figure is reduced by 21%, this gives us another baseline with which to make a comparison to the ticketing percentages. Both baselines--black driving age population, and black driving age population less 21%--for Akron, Dayton, Toledo, and Franklin County are set out in Table 2.

Table 2. Population Baselines
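The two baselines described above can be sketched in a few lines. This is an illustration under stated assumptions, not the calculation actually used for Table 2: the function name and the input counts are hypothetical, and the 21% reduction is applied directly to the black share of the driving age population, which takes the text's wording literally. The text does not say whether the nonblack side was also adjusted for vehicle ownership, so the sketch leaves it untouched.

```python
def population_baselines(black_driving_age, nonblack_driving_age,
                         no_vehicle_rate=0.21):
    """Return (baseline, reduced_baseline) as fractions.

    baseline: black share of the driving age population (ages 15 to 75
        in the text), computed from census-style counts.
    reduced_baseline: that share reduced by the no-vehicle rate
        (ASSUMPTION: the reduction is applied to the share itself).
    """
    total = black_driving_age + nonblack_driving_age
    if total <= 0:
        raise ValueError("population counts must sum to a positive number")
    baseline = black_driving_age / total
    reduced_baseline = baseline * (1.0 - no_vehicle_rate)
    return baseline, reduced_baseline


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
baseline, reduced = population_baselines(20_000, 80_000)
print(f"baseline: {baseline:.1%}, baseline less 21%: {reduced:.1%}")
```

Lowering the baseline makes any given ticketing percentage look more disproportionate, which is why both versions are carried forward into the comparison.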

The ticketing percentages in Table 1 and the baselines in Table 2 can then be compared by constructing a "likelihood ratio" that will show whether blacks are receiving tickets in numbers that are out of proportion to their presence in the driving age population and the driving age population less 21%. The likelihood ratio will allow the following sentence to be completed: "If you're black, you're ___ times as likely to be ticketed by this police department than if you are not black." A likelihood ratio of approximately one means that blacks received tickets in roughly the proportion one would expect, given their presence in the driving age population. A likelihood ratio of much greater than one indicates that blacks received tickets at a rate higher than would be expected. Using both baselines--the black driving age population, and the black driving age population less 21%--the likelihood ratios for Akron, Dayton, Toledo, and Franklin County are presented in Table 3.

Table 3. Likelihood Ratio "If You're Black, You're __ Times as Likely to Get a Ticket in This City Than if You Are Not Black"
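The text does not spell out the formula behind the likelihood ratio. One reading that fits the sentence it is meant to complete is a ratio of per-capita ticketing rates: the black share of tickets divided by the black share of the baseline, set against the same quotient for nonblacks. The sketch below implements that reading. The function name and the example shares are hypothetical and are not values from the tables.

```python
def likelihood_ratio(black_ticket_share, black_baseline_share):
    """Return how many times as likely a black driver is to be ticketed
    as a nonblack driver, given two shares expressed as fractions.

    ASSUMPTION: the ratio compares tickets per person in each group,
    with everyone who is not black treated as a single nonblack group.
    """
    for share in (black_ticket_share, black_baseline_share):
        if not 0.0 < share < 1.0:
            raise ValueError("shares must be strictly between 0 and 1")
    black_rate = black_ticket_share / black_baseline_share
    nonblack_rate = (1.0 - black_ticket_share) / (1.0 - black_baseline_share)
    return black_rate / nonblack_rate


# Hypothetical shares: 40% of tickets against a 25% baseline.
ratio = likelihood_ratio(0.40, 0.25)
print(f"If you're black, you're {ratio:.1f} times as likely to be ticketed")
```

Under this reading, a ratio of one results whenever the ticket share equals the baseline share, which matches the interpretation given in the text.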

Table 4 combines population baselines from Table 2 and likelihood ratios from Table 3.

Table 4. Combined Population Baselines and Likelihood Ratios

The method used here to attempt to discover whether "driving while black" is a problem in Ohio is less exact than the observation-based method used in New Jersey and Maryland. There are assumptions built into the analysis at several points in an attempt to arrive at reasonable substitutes for observation-based data. Since better data do not exist, all of the assumptions made in the analysis involve some speculation. But all of the assumptions are conservative, calculated to err on the side of caution. According to sociologist and criminologist Joseph E. Jacoby, the numbers used here probably are flawed because blacks are probably "at an even greater risk of being stopped" than these numbers show. For example, blacks are likely to drive fewer miles than whites, which suggests that police have fewer opportunities to stop blacks for traffic violations. In statistical terms, the biases in the assumptions are additive, not offsetting.

What do these figures mean? Even when conservative assumptions are built in, likelihood ratios for Akron, Dayton, Toledo, and Franklin County, Ohio, all either approach or exceed 2.0. In other words, blacks are about twice as likely to be ticketed as nonblacks. When the fact that 21% of black households do not own a vehicle is factored in, the ratios rise, with some approaching 3.0. Assuming that ticketing is a fair mirror of traffic stops in general, the data suggest that a "driving while black" problem does indeed exist in Ohio. There may be race-neutral explanations for the statistical pattern, but none seem obvious. At the very least, further study--something as accurate and exacting as Lamberth's studies in New Jersey and Maryland--is needed.