I recently ran across an article discussing a change in laws punishing individuals who avoid fares when taking the metro. Basically, D.C. decriminalized fare evasion, turning what had been punishable by a $300 fine, ten days in jail, or both into a $50 fine that was essentially unenforceable. As discussed in the linked articles, one reason for the decriminalization was that the punishments were largely hurting minority communities and there were instances of police brutality linked to enforcement. Now the pendulum is swinging back the other way: a proposed law would allow law enforcement to require a fare evader to supply their name and address, and failure to do so can lead to detainment and a fine of up to $100.
Why am I writing about this in a psychology blog?
Well, in the first linked article, one justification for this law is the argument that this enforcement mechanism will be a way to reduce crime. Specifically, the article reports the following claim from Metro Transit Police Department Chief Michael Anzallo:
“a WMATA review of data and camera footage from January to mid-Sept. 2023 found that 97% of people who committed violent crimes on the Metro also committed fare evasion during that period. (Anzallo’s testimony was not immediately available online, nor was a copy of WMATA’s data analysis.)”
Hopefully this 97% number sets all kinds of alarm bells ringing in your head (I’d love to see that data analysis…) but for the sake of argument, let’s pretend that it is accurate. At first blush, 97% makes this policy seem like a no-brainer! However, you might also have an intuitive sense that there’s some funny business going on here. Do we really want to know the probability that a person who commits a violent crime also evades fares? Wouldn’t we rather know the chances that a person who is stopped for evading fares is a violent criminal?
Base-rate neglect
There’s a phenomenon in the psychology of judgment and decision making called base-rate neglect (or the base-rate fallacy): we fail to factor in how often something occurs in general when judging its likelihood in a particular case. Imagine a disease that afflicts 10% of the population. There is a medical test that correctly detects the disease 80 out of 100 times when someone has it (otherwise known as sensitivity, or the true positive rate) and correctly concludes that a person doesn’t have it 80 out of 100 times when they don’t (known as specificity, or the true negative rate).
10% of the population have a disease.
Test detects disease when someone has it - 80%
Test correctly indicates they don’t have it - 80%
Given a positive test result, what is the probability that a randomly selected person has the disease?
70% chance they have the disease
30% chance they have the disease
You might be tempted to say 70% (a study showed that a majority of physicians did just that). I mean, the sensitivity of the test is 80%! But the question you should be asking yourself is: 80% of what? Remember, only 10% have the disease! If there were 100 people, 10 of them would have the disease (the base-rate). With an 80% sensitivity, the test correctly diagnoses 8 of those people and misses 2. There are 90 people who don’t have the disease, and if the test incorrectly flags a healthy person 20% of the time, that’s 20% x 90 = 18 false positives. That means that with a positive test result, you’re still over twice as likely to not have the disease! Out of a total of 26 positive tests, only 8 will have the disease (8/26 ≈ 31%), making 30% the correct answer. The reason this is so tricky is that the percentages (80%) hide the underlying base-rate information, making it easy to ignore, so it is always helpful to ask yourself: 80% of what?
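The same arithmetic can be written as a short Python sketch of Bayes’ rule. The function name `posterior_given_positive` and its parameter names are labels I made up for illustration; the three inputs are the numbers from the example above.

```python
def posterior_given_positive(base_rate, sensitivity, specificity):
    """Return P(disease | positive test) using Bayes' rule."""
    true_positives = sensitivity * base_rate               # sick and flagged
    false_positives = (1 - specificity) * (1 - base_rate)  # healthy but flagged
    return true_positives / (true_positives + false_positives)


# Numbers from the example: 10% base-rate, 80% sensitivity, 80% specificity.
p_disease = posterior_given_positive(base_rate=0.10, sensitivity=0.80, specificity=0.80)
print(f"P(disease | positive test) = {p_disease:.1%}")
```

Raising `base_rate` to 0.5 while leaving the test unchanged moves the answer up to 80%, which is one way to see that the base-rate, not the test, is what drives the low number.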
Applying base-rates to fare evasion
Now we can apply our medical test example to fare evasion and violent criminals! The police chief is focusing on the likelihood that someone evaded a fare, given that they committed a violent crime on the Metro, but violent criminals aren’t the only ones who evade fares! What we really want to know is the probability that any given fare evader is a violent criminal. In order to figure this out, we need to know the probability that a Metro rider is a violent criminal in the first place: our base-rate!
We can get a decent estimate for the proportion of people committing violent crimes in the Metro by taking a look at the Metro Transit Police statistics, which show that in 2022, there were a total of 426 violent crimes (as defined by the FBI’s Uniform Crime Reporting (UCR) Program) committed on any D.C. Metro property (this includes all modes of transportation offered by WMATA). I believe the quote was focused on the metro (i.e., the subway), so it is worth noting that we’re probably overestimating the number of violent crimes by some amount. For ease of analysis, we will also assume that each of these crimes was perpetrated by a unique individual.
Estimating the number of riders using the D.C. Metro services is a bit trickier. For the purposes of this exercise, I’m going to take a straightforward approach. I start by treating any adult under age 65 living in the D.C. metro area as a rider (and a potential violent criminal), which would be approx. 4,015,512 people (not that people 65 and over can’t commit violent crimes, but statistics suggest it is rare enough to discount here). Next, I leverage a 2023 poll by The Washington Post which showed that 79% of people living in the D.C. area reported using the metro at least once (more accurately, they reported using it at least “rarely”). This puts us at 3,172,255. This may overestimate our population of interest in some ways, but it could also be an underestimate, since folks living in the area aren’t the only ones who take the metro (and people move in and out), so the bias goes in both directions. (I tried many other approaches relying on metro ridership data as well as Census commuting-to-work data to get this number, and honestly, the chosen approach simply had fewer pitfalls; see the appendix for other potential base-rates and their issues.)
Probability of Violent Criminal =
426/3,172,255 = 0.0001342 (13.42 per 100,000)
With that information, we can also calculate the probability of being a non-criminal, because this should be whatever is left over (“left over”, perhaps not being the appropriate term for over 99% of the population…)
Probability of Non-criminal =
1 - 0.00013 = 0.99987 (99,987 per 100,000)
At this point, you might already be seeing the issue here. But in order to figure out the probability of a fare evader being a violent criminal, we also need to know the probability of being a fare evader. Luckily, according to WMATA’s dataset (from Jan. 2023 to Sept. 2023), the average rate of fare evasion in 2023 is 12.26%. So:
Probability of fare-evasion =
0.1226 (12,260 per 100,000)
Probability of paying the fare (non-evader) =
1 - 0.1226 = 0.8774 (87,740 per 100,000)
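For readers who want to check the numbers so far, here is a minimal Python sketch of the four marginal probabilities. The variable names are mine; the inputs are the figures quoted above, and the one-perpetrator-per-crime assumption from earlier still applies.

```python
violent_crimes = 426      # 2022 violent crimes on Metro property (from the text)
riders = 3_172_255        # 79% of 4,015,512 area adults under 65 (from the text)
evasion_rate = 0.1226     # WMATA average fare-evasion rate, Jan. to Sept. 2023

# Assumption carried over from the text: each crime has a unique perpetrator.
p_violent = violent_crimes / riders
p_not_violent = 1 - p_violent
p_evade = evasion_rate
p_pay = 1 - p_evade

for label, p in [
    ("violent criminal", p_violent),
    ("non-criminal", p_not_violent),
    ("fare evader", p_evade),
    ("fare payer", p_pay),
]:
    print(f"{label:<18}{p:.7f}  ({p * 100_000:,.2f} per 100,000)")
```

Because the script keeps full precision, the last digit of a printed value can differ slightly from the rounded figures in the text.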
So, given all of this information, we can build out a table with the following information:
This looks promising, but how do we fill in these empty cells?
Thanks to the Metro Chief of Police, we are assuming that the probability that a violent criminal will evade a fare is 0.97. This allows us to calculate the proportion of violent criminals (on the Metro) who evade fares:
Violent criminals who evade fares =
0.97 x 0.0001342 = 0.0001302607 (13.02 in 100,000)
Now that we know this, we can do a little more arithmetic to fill in the rest of this table! Based on our 0.97 probability for violent criminals who evade, we can assume that 3% of violent criminals do not evade. This allows us to calculate the proportion of non-fare evading criminals:
Violent criminals who do not evade fares =
0.03 x 0.0001342 = 0.000004 (0.4 in 100,000)
Since we also know that the total probability of a fare evader is 0.1226 (12.26%), to get the probability of a non-criminal who is a fare evader, we can just subtract:
Non-criminals who evade fares =
0.1226 - 0.0001302607 = 0.1224697 (12,246.97 in 100,000)
Finally, we can do the same arithmetic for non-fare evaders who are not criminals:
Non-criminals who do not evade fares =
0.8774 - 0.000004 = 0.877396 (87,739.6 in 100,000)
Putting this all together, we can now see the chances for every possible outcome!
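As a rough sketch, the whole two-by-two table can be rebuilt in a few lines of Python from the three inputs used above. The variable names and the printed layout are my own choices for illustration, and the 0.97 figure is still taken at face value.

```python
p_violent = 426 / 3_172_255      # base-rate from the text
p_evade = 0.1226                 # overall fare-evasion rate from the text
p_evade_given_violent = 0.97     # the police chief's figure, taken at face value

# Joint probabilities for each cell of the 2x2 table.
violent_evade = p_evade_given_violent * p_violent
violent_pay = (1 - p_evade_given_violent) * p_violent
other_evade = p_evade - violent_evade
other_pay = (1 - p_evade) - violent_pay

table = [
    ("Violent criminal", violent_evade, violent_pay),
    ("Non-criminal", other_evade, other_pay),
    ("Total", violent_evade + other_evade, violent_pay + other_pay),
]

print(f"{'':<18}{'Evades fare':>14}{'Pays fare':>14}{'Total':>14}")
for label, evade, pay in table:
    print(f"{label:<18}{evade:>14.7f}{pay:>14.7f}{evade + pay:>14.7f}")
```

The four inner cells sum to 1, which is a quick sanity check that no riders were lost or double-counted along the way.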
So what’s the probability that a fare evader is a violent criminal?
We now have a whole lot of information here, but you might have noticed that we still haven’t answered our main question: what is the probability that a fare evader is a violent criminal? Thanks to our table, we know that 0.1226 of metro riders evade fares. We also know that, out of all the metro riders, 0.00013 are violent criminals who evade fares, and 0.1224697 are fare evaders who are not violent criminals. Since we are only really interested in fare evaders, all we need to do is look at what proportion of that 12.26% consists of violent criminals! That means dividing our 0.00013 by 0.1226:
Probability that a fare evader is a violent criminal =
0.00013 / 0.1226 = 0.00106 (106 in 100,000)
We can now finally say that, given what we were told by the Metro Chief of Police, we can expect about 106 in every 100,000 fare evaders to be violent criminals (or roughly 1 in 1,000).
We can repeat this math to get the probability of a fare evader not being a criminal (or we could’ve simply subtracted the previous probability from 1):
Probability that a fare evader is not a violent criminal =
0.1224697 / 0.1226 = 0.9989375 (99,893.75 in 100,000)
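The division above is just Bayes’ rule, and a minimal Python sketch makes that explicit. The function name `p_violent_given_evader` is hypothetical, and the inputs are the same three figures used throughout.

```python
def p_violent_given_evader(p_violent, p_evade, p_evade_given_violent):
    """Bayes' rule: P(violent | evader) = P(evader | violent) * P(violent) / P(evader)."""
    return p_evade_given_violent * p_violent / p_evade


posterior = p_violent_given_evader(
    p_violent=426 / 3_172_255,   # base-rate from the text
    p_evade=0.1226,              # overall fare-evasion rate from the text
    p_evade_given_violent=0.97,  # the police chief's figure, taken at face value
)
print(f"P(violent criminal | fare evader)     = {posterior:.7f}")
print(f"P(not violent criminal | fare evader) = {1 - posterior:.7f}")
```

Writing it as a function also makes it easy to swap in a different base-rate or evasion rate and see how much the answer moves.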
What about a more conservative estimate?
An obvious aspect of this analysis that people may argue over is my choice of base-rate. There are many reasons why taking a percentage of the population living in the D.C. area based on a ridership poll could be biased (both upwards and downwards). Just to drive my point home, I re-did this analysis including only the estimated 25% of the population who report riding sometimes, fairly often, or very often (1,003,878 riders). Even when limiting the rider pool to a quarter of the population (still only counting adults), the probability of a fare evader being a violent criminal is 0.00336 (336 in 100,000), which means the police would have to stop roughly 997 people who won’t be committing violent crimes in order to get the information of 3 who will.
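To see how sensitive the result is to the choice of rider pool, here is a small Python sketch that reruns the calculation for both ridership shares. The helper `posterior_for_share` is a name I invented for illustration; everything else comes from the figures above.

```python
AREA_ADULTS = 4_015_512        # adults under 65 in the D.C. metro area (from the text)
VIOLENT_CRIMES = 426           # 2022 violent crimes on Metro property (from the text)
P_EVADE = 0.1226               # overall fare-evasion rate (from the text)
P_EVADE_GIVEN_VIOLENT = 0.97   # the police chief's figure, taken at face value


def posterior_for_share(rider_share):
    """P(violent | evader) if only `rider_share` of area adults count as riders."""
    riders = AREA_ADULTS * rider_share
    p_violent = VIOLENT_CRIMES / riders
    return P_EVADE_GIVEN_VIOLENT * p_violent / P_EVADE


for share in (0.79, 0.25):
    p = posterior_for_share(share)
    print(f"rider share {share:.0%}: {p:.5f} ({p * 100_000:.0f} per 100,000)")
```

Shrinking the rider pool raises the base-rate, so the posterior goes up, but it stays a small fraction of a percent either way.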
Base-rates matter. A lot.
As a psychologist, my primary interest was to demonstrate the importance of considering base-rates. We showed how an impressive-sounding statistic presented by a Metro Chief of Police (97% of people committing violent crimes on the Metro evade fares!), meant to convince you of a preferred policy, can be extraordinarily misleading. The implied meaning of the (somewhat dubious) 97% statistic is that “we could catch more violent criminals if we only enforced fare evasion more strictly.” I mean, the vast majority of folks committing violent crimes on the Metro avoid fares, so cracking down on fare evasion must logically lead to less violent crime, right? Well, now we know why this is wrong. A very small fraction of the D.C. metro ridership consists of people who commit violent crimes on the Metro, so the vast majority of fare evaders are folks who don’t commit violent crimes.
So the next time you hear a sensational claim that tries to generalize about a large group by focusing only on a very small subset (e.g., “most mass murderers are men,” or “most terrorists are [insert whichever group is in the news most recently]”), think about what the underlying population looks like. The vast majority of the people in the world are not mass murderers (regardless of their gender), and the number of people in [insert group] who are not terrorists is so large that you might mistake the ones who are terrorists for a rounding error.
Addendum
In my analysis above, I was pretty strict in terms of keeping in line with the specific statement by the Metro Transit Police Chief. This means I focused squarely on people who had committed violent crimes on the Metro. There are other arguments made for this policy that suggest stricter fare enforcement could catch other kinds of criminals, since an officer could do a background check to look for open warrants. I did check the active warrant list on the D.C. courts website and discovered a total of 2,679 unique names with active warrants. This is definitely larger than the 426 violent crimes committed on the Metro, so it is potentially worth exploring.
Before we jump right into saying that stricter fare evasion laws could contribute to catching the 2,679 folks with active warrants, it is worth considering what the law does and does not do. The law requires that individuals stopped for evading fares provide their names and addresses. It does not mean that an officer will opt to do a background check every single time they stop someone (I’d venture that maybe this would happen 20% of the time…) We also cannot know whether all of these people are riding the Metro (or even if they all still live in D.C.), though it is reasonable to assume that not all of them do (being generous, we could say that 80% take the Metro.) Just taking these two limitations into account drops us to ~428 potential criminals with warrants.
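The warrant estimate is a simple chain of multiplications, sketched here in Python. Both percentages are the guesses stated above rather than measured values, and the variable names are mine.

```python
active_warrants = 2_679    # unique names on the D.C. courts list (from the text)
share_riding_metro = 0.80  # generous guess from the text, not a measured value
share_checked = 0.20       # guess from the text at how often a check is run

reachable = active_warrants * share_riding_metro * share_checked
print(f"People with warrants plausibly reachable: about {reachable:.1f}")
```

Changing either guess scales the result proportionally, so halving the background-check rate halves the estimate.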
Finally, just because someone evades a fare doesn’t mean they will be caught. Statistics from the Metro Transit Police in 2017 (back when the fare evasion laws were stricter) show that police recorded 15,409 citations and arrests for fare evasion for the year. For context, data from 2023 (Jan. to Sept.) shows 10,490,732 evasions so far. Comparing the two puts the citation rate at roughly 1 to 1.5 citations per 1,000 evasions. All of this is to say that even granting the argument that stricter fare enforcement could catch people with outstanding warrants, it doesn’t seem likely that this would happen.
Appendix
Alternate approaches to metro ridership
We could take a look at statistics on metro ridership. WMATA’s dataset (from Jan. 2023 to Sept. 2023) records 72,612,208 Metrorail rides over that period. There are several problems with this approach: many people ride almost every day, and ridership data doesn’t count people, it counts entries into Metrorail, which means the same person could be counted multiple times. Taking the average daily rides (313,426) has the opposite problem, as it assumes the same people ride every day. A third approach would be to look at Census data on commuting to work. Data from the 2022 Census show that of the 3,412,293 workers (16 and older), 6% (or 204,738) use a form of public transportation. This also likely underestimates metro riders, since it only focuses on repeat users (though it also counts all forms of public transit, not just the metro).
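As a rough sketch, the same calculation can be run with each of these alternative rider counts plugged in as the denominator. The dictionary labels are shorthand of my own, and combining the 2022 crime count with these denominators inherits every problem described above, so treat the output as illustrative only.

```python
VIOLENT_CRIMES = 426           # 2022 violent crimes on Metro property (from the text)
P_EVADE = 0.1226               # overall fare-evasion rate (from the text)
P_EVADE_GIVEN_VIOLENT = 0.97   # the police chief's figure, taken at face value

# Candidate rider counts from this appendix; each has the flaws described above.
candidates = {
    "Metrorail entries, Jan. to Sept. 2023": 72_612_208,
    "average daily rides": 313_426,
    "Census transit commuters": 204_738,
}

posteriors = {}
for label, riders in candidates.items():
    p_violent = VIOLENT_CRIMES / riders
    posteriors[label] = P_EVADE_GIVEN_VIOLENT * p_violent / P_EVADE
    print(f"{label:<40}{posteriors[label]:.5f}")
```

The point of lining them up is only to show how much the answer depends on the denominator, not to recommend any one of them.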