Simpson’s Paradox

Berkeley, a small town twenty minutes east of San Francisco, is home to UC Berkeley, the oldest campus of the University of California. This prestigious university is famous for its academic excellence and is the alma mater of more than seventy Nobel Prize winners. In 1973, the university discovered something shocking about its admissions: 44 percent of all men who applied for a PhD position at UC Berkeley were accepted, whilst only 35 percent of women were accepted. This seemed to indicate gender discrimination in the selection process!

UC Berkeley would not be UC Berkeley if it did not strive to get to the bottom of this. Three scientists, led by statistics professor Peter Bickel, set up a study, and two years later, in 1975, their findings were published in the renowned scientific journal ‘Science’. The scientists claimed that the 1973 university admission rates were not necessarily an example of gender discrimination. They had found a striking example of a phenomenon that has come to be known as ‘Simpson’s Paradox’.

During their study, Bickel and his colleagues looked at nearly thirteen thousand applications to the 101 departments of UC Berkeley. Given the substantial size of this group, it seemed impossible, a priori, that the large difference between the admission rates of men and women was coincidental. In the most optimistic scenario, the gap between the 44 percent admission rate for men and the 35 percent admission rate for women would be due to prejudices, perhaps unconscious ones; in the worst case, it would be a matter of deliberate discrimination. If this was indeed the case at the university, it should be possible to identify the responsible party.

With this thought in mind, the scientists started looking for a culprit. Within UC Berkeley, students apply to individual departments. An admission or rejection is therefore decided by the staff of that specific department and not by the board of the university itself. The scientists therefore considered it worthwhile to take a closer look at these departments. This is where the story gets more complex. Amongst all 101 departments, the scientists found only four that had admitted fewer women than one would expect based on fair, gender-independent admission rates. Conversely, there were six departments that had in fact admitted relatively fewer men. Thus, the data broken down by department did not show any evidence of discrimination against women. As a matter of fact, it even seemed that men had a slight disadvantage when applying.

How can this phenomenon arise? Consider two of the departments. At the first, 933 people applied, of whom 825 were men and 108 were women. Of the men, 512 were admitted, which is about 62 percent, whilst 89 of the women, more than 82 percent, were admitted. In this department, the acceptance rate of women is therefore considerably higher than the acceptance rate of men. At the second department, 373 men and 341 women applied. Here only 21 men were admitted, which is about 6 percent of the male applicants, as compared to 23 women, about 7 percent of the female applicants. Taken separately, each department admitted a larger share of its female applicants than of its male applicants. Nevertheless, if we combine the numbers of the two departments, we get a different picture. There were a total of 1,198 male applicants, of whom 533 were admitted, amounting to 44 percent. On the other hand, there were 449 female applicants, but only 112 of them were admitted, i.e. a measly 25 percent. The departments combined seem to ‘discriminate’ against women, whilst each department on its own seems to have a slight preference for women!
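The arithmetic behind this reversal can be checked with a few lines of Python. This is only a sketch for the two departments described above, not the full Berkeley dataset. The admitted counts come from the text; for the second department it assumes 373 male and 341 female applicants, the counts implied by the combined totals of 1,198 and 449. The names `departments`, `rate` and `pooled` are made up for this illustration.

```python
# (applicants, admitted) per department and gender.
# Assumption: the second department's applicant counts (373 and 341) are
# the ones implied by the combined totals of 1,198 men and 449 women.
departments = {
    "first": {"men": (825, 512), "women": (108, 89)},
    "second": {"men": (373, 21), "women": (341, 23)},
}


def rate(applied, admitted):
    """Admission rate as a fraction between 0 and 1."""
    return admitted / applied


def pooled(group):
    """Total (applicants, admitted) for one gender across all departments."""
    applied = sum(d[group][0] for d in departments.values())
    admitted = sum(d[group][1] for d in departments.values())
    return applied, admitted


# Per department: women have the higher admission rate in both.
for name, d in departments.items():
    print(name, f"men {rate(*d['men']):.1%}", f"women {rate(*d['women']):.1%}")

# Pooled: the ordering flips and men have the higher rate.
for group in ("men", "women"):
    applied, admitted = pooled(group)
    print("combined", group, f"{admitted}/{applied} = {rate(applied, admitted):.1%}")
```

Nothing about the individual departments changes between the two loops; only the level of aggregation does.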


Sather Gate at the University of California, Berkeley


One of the drivers behind this striking result is that women were more likely to apply to departments where fewer applicants could be admitted. Men, on the other hand, tended to apply to the ‘less competitive’ departments. Partly for this reason, Bickel and his colleagues, looking into Berkeley’s admissions process, could not conclude that there was evidence of discrimination based on gender. They did conclude, however, that the difference between the choices made by male and female applicants was noteworthy.
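One way to see this driver at work is to write each gender’s overall admission rate as a weighted average of the department rates, where the weights are the shares of that gender’s applications going to each department. The sketch below does this for the two departments described earlier. As an assumption, it uses 373 male and 341 female applicants for the second department (the counts implied by the combined totals of 1,198 and 449), and the helper names `shares`, `rates` and `overall` are hypothetical.

```python
# Applicants and admitted students per gender and department.
# Assumption: second-department applicant counts (373, 341) are the ones
# implied by the combined totals quoted in the text.
applied = {
    "men": {"first": 825, "second": 373},
    "women": {"first": 108, "second": 341},
}
admitted = {
    "men": {"first": 512, "second": 21},
    "women": {"first": 89, "second": 23},
}


def shares(group):
    """Fraction of a group's applications that went to each department."""
    total = sum(applied[group].values())
    return {dept: n / total for dept, n in applied[group].items()}


def rates(group):
    """Admission rate of a group within each department."""
    return {dept: admitted[group][dept] / applied[group][dept] for dept in applied[group]}


def overall(group):
    """Overall rate as the application-share-weighted average of department rates."""
    s, r = shares(group), rates(group)
    return sum(s[dept] * r[dept] for dept in s)


for group in ("men", "women"):
    s = shares(group)
    print(f"{group}: {s['first']:.0%} applied to the first department, "
          f"overall rate {overall(group):.1%}")
```

Under these figures, most men applied to the department with the high admission rate, while most women applied to the one with the low rate, so the women’s weighted average is pulled down even though their rate is higher in each department.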

Conclusions drawn from a large dataset can therefore be contradicted by conclusions drawn from subgroups within that dataset. This phenomenon is known as Simpson’s Paradox, named after the British statistician Edward Simpson, who published an article in 1951 explaining the phenomenon. Today, there are numerous examples of this paradox, for example in medical studies, sports statistics, and research on quality of care and education. For instance, the New York Times reported in 2013 that average incomes in the United States had increased by one percent between 2000 and 2013. However, the journalists also noted that, over the same period, average salaries per level of education had actually fallen. Whether you looked at college graduates or at people who hadn’t finished high school, each group saw its salaries fall between 2000 and 2013. The key is that in that period more people went to university, and relatively fewer highly skilled people were unemployed, which meant that, overall, salaries increased. This is another example of Simpson’s Paradox!

Simpson’s Paradox is a warning that statistics can be treacherous. Statistics and data are important tools that can guide government policy, business strategies, or evaluations of professional athletes. However, statistics also lend themselves to misinterpretation. If we don’t handle data carefully, a study in the field of medicine may conclude that a drug works, whereas in each of the separate target groups it might not. Naïve conclusions from data can lead a government to invest millions in a project or in legislation that backfires. With Simpson’s Paradox in mind, data experts know they need to approach statistics with a critical eye.



