Saturday, August 06, 2011

Voting & Simpson's paradox

Until relatively recently I did not know that there was something called Simpson's Paradox.

I mean that literally – I did not know that there was a name for something that seems to me, a mere descriptive statistician, a commonplace.

The Stanford Encyclopedia of Philosophy introduces and summarises the problem thus:
An association between a pair of variables can consistently be inverted in each subpopulation of a population when the population is partitioned. For example, a medical treatment can be associated with a higher recovery rate for treated patients compared with the recovery rate for untreated patients; yet, treated male patients and treated female patients can each have lower recovery rates when compared with untreated male patients and untreated female patients. The arithmetical structures that underlie facts like these invalidate a cluster of arguments that many people, at least initially, take to be intuitively valid. E.g., despite intuitions to the contrary, the following argument is invalid.

The probability of male patients recovering following treatment is greater than the probability of their recovering following no treatment.

The probability of female patients recovering following treatment is greater than the probability of their recovering following no treatment.

Therefore, the probability of (male and female) patients recovering following treatment is greater than the probability of their recovering following no treatment.

Further, the arithmetical structures that invalidate such arguments pose deep problems for inferences from statistical regularities to conclusions about causal relations.
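
The reversal is easy to check with made-up numbers. The sketch below uses invented recovery counts (not data from any study) in which treatment looks better overall yet worse for men and for women taken separately. The names `counts`, `rate` and `pooled_rate` are mine, purely for illustration.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration: (recovered, total) per group.
counts = {
    ("men", "treated"): (80, 100),
    ("men", "untreated"): (9, 10),
    ("women", "treated"): (1, 10),
    ("women", "untreated"): (20, 100),
}


def rate(sex, arm):
    """Recovery rate within one subpopulation."""
    recovered, total = counts[(sex, arm)]
    return Fraction(recovered, total)


def pooled_rate(arm):
    """Recovery rate for men and women combined."""
    recovered = sum(r for (_, a), (r, _) in counts.items() if a == arm)
    total = sum(t for (_, a), (_, t) in counts.items() if a == arm)
    return Fraction(recovered, total)


for sex in ("men", "women"):
    print(sex, float(rate(sex, "treated")), float(rate(sex, "untreated")))
print("all", float(pooled_rate("treated")), float(pooled_rate("untreated")))
```

With these numbers the treated do worse among men (80/100 against 9/10) and among women (1/10 against 20/100), yet better when pooled (81/110 against 29/110). The trick is simply that most of the treated are men, who recover more often whether treated or not.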

In thinking about examples I have come across, it occurred to me to wonder whether this is not, in part at least, what the argument over fair voting is all about. For a first-past-the-post, constituency-based system can, even with only two parties, produce a winner with a minority of the overall national vote, which is certainly a reversal.
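
A toy election shows how this can happen. The figures below are invented, as are the constituency names and the helper functions `seats_won` and `national_votes`; it is a sketch of the arithmetic, not a model of any real electoral system.

```python
# Hypothetical two-party results: votes for A and B in each constituency.
results = {
    "North": {"A": 5100, "B": 4900},
    "South": {"A": 5100, "B": 4900},
    "East": {"A": 2000, "B": 8000},
}


def seats_won(results):
    """First past the post: each constituency goes to its local winner."""
    seats = {"A": 0, "B": 0}
    for votes in results.values():
        winner = max(votes, key=votes.get)
        seats[winner] += 1
    return seats


def national_votes(results):
    """Total votes per party across all constituencies."""
    totals = {"A": 0, "B": 0}
    for votes in results.values():
        for party, n in votes.items():
            totals[party] += n
    return totals


seats = seats_won(results)
totals = national_votes(results)
print("seats:", seats)
print("votes:", totals)
```

Party A takes two seats out of three with 12,200 of the 30,000 votes cast, while party B takes one seat with 17,800. Note that the three constituencies here are the same size, so the reversal comes entirely from where each party's supporters happen to live.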

The paradox or perplexity (or unfairness) arises in part because of the variation in the distribution of voters across constituencies – which is why we have embarked upon a major exercise to draw boundaries which attempt to ensure that each constituency will carry the same weight in the next national election.

Unfortunately there is nothing much that we can do to ensure an even geographical distribution of party preferences. So we may be even more surprised & perplexed by the outcome.