Investing often involves closely examining numbers. Investors rely on data, and trends in that data, for insight into what's happening with their investments. Sometimes the numbers look odd even to analysts, and Simpson's Paradox is a prime example.
In this article, we discuss Simpson's Paradox. We'll provide a summary definition and a brief history of the term, then finish with some examples illustrating how this paradox can apply to investment portfolios.
Simpson's Paradox
In the study of probability and statistics, Simpson's paradox is the seemingly contradictory result in which a trend that appears in every subpopulation, such as an improvement, disappears or even reverses when those subpopulations are combined.
History
In the paper "The Interpretation of Interaction in Contingency Tables," published in the Journal of the Royal Statistical Society back in 1951, Edward Hugh Simpson explained the phenomenon whereby an event that would increase the occurrence of a condition in a given population could, at the same time, decrease the occurrence of that same condition in every subpopulation.
The importance of this paper to all analysts is simple: Don't make assumptions about data; take care when interpreting numbers. Let's see how this works with some practical examples.
Finding Average Values
Every analyst knows that when examining a population, it's possible to calculate average values for each segment of the population. They also know it's generally incorrect to take a simple average of those segment averages to determine the average for the population, because the segments are rarely the same size. This point can be demonstrated with a hypothetical example consisting of a three-stock portfolio:
Average Value Example
Investment | Starting Value | Ending Value | Increase |
Stock A | 100 | 110 | 10.0% |
Stock B | 200 | 240 | 20.0% |
Stock C | 300 | 390 | 30.0% |
Totals | 600 | 740 | 23.3% |
By taking a simple average of the increases for the above three stocks, the analyst might incorrectly conclude the overall portfolio increase was 20%. The Totals row demonstrates the correct value is actually 23.3%. Anyone who has made this mistake in the past knows the rule is "you cannot take an average of an average."
Another reliable approach is to weight each stock's increase by its share of the total starting value and add the results together, as shown in this second example:
Weighted Value Example
Investment | Starting Value | Ending Value | Increase | Weighted Value |
Stock A | 100 | 110 | 10.0% | 1.7% |
Stock B | 200 | 240 | 20.0% | 6.7% |
Stock C | 300 | 390 | 30.0% | 15.0% |
Totals | 600 | 740 | 23.3% | 23.3% |
The weighted value is found by dividing each starting value by the total of all starting values, then multiplying that fraction by the stock's increase. For Stock A, the calculation is:
Stock A Weighted Value = (100 / 600) x 10.0% = 1.7%
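The two calculations above can be compared side by side with a short Python sketch. It is only an illustration using the figures from the tables; the variable names are assumptions made for this example.

```python
# Hypothetical three-stock portfolio from the tables above: (starting value, ending value).
portfolio = {
    "Stock A": (100, 110),
    "Stock B": (200, 240),
    "Stock C": (300, 390),
}

total_start = sum(start for start, _ in portfolio.values())
total_end = sum(end for _, end in portfolio.values())

# Each stock's increase as a fraction of its own starting value.
increases = {name: (end - start) / start for name, (start, end) in portfolio.items()}

# Simple (unweighted) average of the three increases: the misleading figure.
simple_average = sum(increases.values()) / len(increases)

# Weighted average: each increase scaled by the stock's share of total starting value.
weighted_average = sum(
    (start / total_start) * increases[name]
    for name, (start, _) in portfolio.items()
)

# The portfolio's actual increase, computed directly from the totals.
portfolio_increase = (total_end - total_start) / total_start

print(f"Simple average:     {simple_average:.1%}")
print(f"Weighted average:   {weighted_average:.1%}")
print(f"Portfolio increase: {portfolio_increase:.1%}")
```

The weighted average matches the increase calculated from the totals, while the simple average does not, because it ignores how much money sits in each stock.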
Is it possible for every stock in a portfolio to rise in price while the overall value of the portfolio falls? The answer is yes, especially if someone is actively trading stocks.
Investment Portfolios
A very simple stock portfolio can demonstrate Simpson's paradox. In this case, there is a hypothetical portfolio consisting of three stocks, and trading is limited to exchanges between these stocks. The total number of shares held at the start and end of this timeline is exactly the same (3,000). The ending price of each stock is exactly 20% higher than its starting price. Unfortunately, the ending value of the portfolio is exactly 10% lower than its starting value, as demonstrated in the example below.
Simpson's Paradox: Stock Example
Measure | Stock A | Stock B | Stock C | Totals |
Starting Share Price | $10.00 | $20.00 | $30.00 | |
Starting Shares Held | 1,000 | 1,000 | 1,000 | 3,000 |
Starting Value | $10,000 | $20,000 | $30,000 | $60,000 |
Ending Share Price | $12.00 | $24.00 | $36.00 | |
Ending Shares Held | 2,000 | 500 | 500 | 3,000 |
Ending Value | $24,000 | $12,000 | $18,000 | $54,000 |
For the above to be true, stock prices must have dipped when the majority of the trades occurred. Still, this example makes the point, and this scenario can, and does, occur all the time. During bear markets, many investors try to time the market and wind up selling low, only to re-enter the market after prices have risen. This is the classic mistake of "buy high / sell low."
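The arithmetic in the stock example can be checked with a short Python sketch. It uses only the prices and share counts from the table; the helper name portfolio_value is an assumption made for this illustration.

```python
# Figures from the stock example table: (share price, shares held) per stock.
start = {"Stock A": (10.00, 1000), "Stock B": (20.00, 1000), "Stock C": (30.00, 1000)}
end = {"Stock A": (12.00, 2000), "Stock B": (24.00, 500), "Stock C": (36.00, 500)}


def portfolio_value(holdings):
    """Sum share price times shares held over every position."""
    return sum(price * shares for price, shares in holdings.values())


# Price change of each stock between the two snapshots.
price_changes = {name: end[name][0] / start[name][0] - 1 for name in start}

start_value = portfolio_value(start)
end_value = portfolio_value(end)
portfolio_change = end_value / start_value - 1

for name, change in price_changes.items():
    print(f"{name}: price change {change:+.1%}")
print(f"Portfolio: ${start_value:,.0f} -> ${end_value:,.0f} ({portfolio_change:+.1%})")
```

Every share price rises, yet the portfolio's value falls, because the mix of shares held shifted toward the lowest-priced stock between the two snapshots.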
A more commonly cited example of Simpson's paradox has to do with unemployment rates. This example involves breaking a population into subgroups based on their level of education. Over a five-year timeframe, each segment experiences an increase in its unemployment rate, yet the unemployment rate of the entire population goes down. This second example illustrates how this can happen.
Simpson's Paradox: Unemployment Example
Measure | No High School | High School | College Degree | Totals |
Initial Counts | 10,000,000 | 10,000,000 | 10,000,000 | 30,000,000 |
Initial Unemployment Rate | 8.0% | 6.0% | 4.0% | 6.0% |
Initial Unemployment Count | 800,000 | 600,000 | 400,000 | 1,800,000 |
Final Counts | 5,000,000 | 5,000,000 | 20,000,000 | 30,000,000 |
Final Unemployment Rate | 8.8% | 6.6% | 4.4% | 5.5% |
Final Unemployment Count | 440,000 | 330,000 | 880,000 | 1,650,000 |
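In this second example, the unemployment rate for each subgroup increased by 10% in relative terms (from 8.0% to 8.8%, for instance), yet the overall rate fell from 6.0% to 5.5%. The overall rate fell because the population shifted toward the college-degree group, which has the lowest unemployment rate. These two examples serve to illustrate not only Simpson's paradox, but also the importance of examining data with care.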
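The unemployment figures can be reproduced with a short Python sketch. It takes the population counts and rates from the table as given; the function name overall_rate is an assumption made for this illustration.

```python
# Figures from the unemployment example: (population, unemployment rate) per group.
initial = {
    "No High School": (10_000_000, 0.080),
    "High School": (10_000_000, 0.060),
    "College Degree": (10_000_000, 0.040),
}
final = {
    "No High School": (5_000_000, 0.088),
    "High School": (5_000_000, 0.066),
    "College Degree": (20_000_000, 0.044),
}


def overall_rate(groups):
    """Population-weighted unemployment rate across all groups."""
    unemployed = sum(population * rate for population, rate in groups.values())
    total_population = sum(population for population, _ in groups.values())
    return unemployed / total_population


# Compare each group's rate before and after.
for name in initial:
    print(f"{name}: {initial[name][1]:.1%} -> {final[name][1]:.1%}")

# Compare the combined rate before and after.
print(f"Overall: {overall_rate(initial):.1%} -> {overall_rate(final):.1%}")
```

Each group's rate goes up, but the combined rate goes down, because the weights (the group sizes) changed between the two periods.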