Mathematics that will not save democracy

On the probability of the Condorcet Jury Theorem or the Miracle of Aggregation – Álvaro Romaniega

Click here for the Spanish version.

Cited by:
El País – Matemáticas para entender y mejorar la democracia (Mathematics to understand and improve democracy).

Summary in English

In 1785, Condorcet published a result showing how voting can efficiently aggregate the private information of a group of agents. The theorem applies to a dichotomous choice between A and B in which one option, say A, is correct1 (for instance, a jury deciding whether a defendant is guilty or innocent). It is assumed that each agent is more competent than a coin flip and that votes are independent of each other. Under these hypotheses, Condorcet showed that if votes are aggregated by the simple majority rule (A wins if it receives more votes than B), the probability of choosing the correct option tends to one as the number of voters grows to infinity. Hence, information can be aggregated efficiently: if voters are even slightly competent, we obtain a (nearly) perfect decision-making procedure, that is, the probability of being correct is as close to one as we want provided there are enough voters. We move from being slightly better than random to near-perfect collective competence. This is the famous Condorcet Jury Theorem (CJT).
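The homogeneous case is easy to check numerically. A minimal sketch (mine, not from the paper), using an assumed common competence of 0.6 and an odd number of voters so that ties cannot occur:

```python
import math

def majority_accuracy(n, p):
    """Exact probability that a simple majority of n voters, each
    independently correct with probability p, picks the correct
    option (n odd, so no ties are possible)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# The accuracy climbs toward 1 as the electorate grows:
for n in (1, 11, 101, 1001):
    print(n, round(majority_accuracy(n, 0.6), 4))
```

With p = 0.6 the single voter is right 60% of the time, but a majority of 1001 such voters is right with probability essentially one, exactly the CJT phenomenon.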

Condorcet assumed that voters are homogeneous, i.e., that all voters have the same competence, but this hypothesis is very unrealistic. More generally, each voter can be assigned a probability between zero and one of choosing the correct option. More recent analyses have established the necessary and sufficient conditions these probabilities must satisfy for the CJT thesis to hold. We can now ask: what is the probability that the voters’ competences guarantee the applicability of the CJT?

This question is answered in the paper. The answer is of great interest, as the theorem is often invoked in discussions about aggregation processes and is a very powerful result when it applies. For example, it could serve as a response to the objection of rational ignorance among voters.

First, we must clarify what we mean by probability. From a Bayesian perspective, we distinguish between prior probability and posterior probability. Our work focuses on the prior probability, that is, the probability before any evidence about the competence of voters has been collected. The connection between prior and posterior probability is given by Bayes’ theorem: the prior probability is the starting point from which, after obtaining evidence, we compute the posterior probability.

Unfortunately, the conclusion is that the prior probability of the theorem’s thesis is zero, and this result is robust across an infinite family of prior distributions. That is, if we try to measure the applicability of the CJT according to a symmetrically balanced distribution (in particular, one without bias towards incompetence), we find that the theorem almost certainly does not hold. Therefore, in this context, we would need strong evidence of voter competence to expect the CJT thesis to be applicable.
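To build intuition for why a symmetric prior is so unfavorable, here is a minimal Monte Carlo sketch (my own illustration, not the paper’s construction): drawing each voter’s competence uniformly on (0, 1), a symmetric distribution with no bias toward incompetence, the simple majority hovers around 1/2 no matter how large the electorate is.

```python
import random

def mc_majority_rate(n, trials=5000, seed=0):
    """Estimate P(simple majority correct) when each voter's
    competence p is drawn uniformly on (0, 1) afresh each trial.
    Marginally each vote is then correct with probability 1/2,
    so no CJT-style convergence to 1 occurs."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        correct = 0
        for _ in range(n):
            p = rng.random()            # random competence
            if rng.random() < p:        # this voter happens to be right
                correct += 1
        if correct > n - correct:
            wins += 1
    return wins / trials

for n in (11, 101, 501):
    print(n, mc_majority_rate(n))
```

Adding voters does not help here: the estimated accuracy stays near 0.5 for every n, in sharp contrast with the homogeneous p > 1/2 case.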

Representation of the theorem in the article. Given random competences, understood as probabilities of voting for the correct option, \mathbb{P}\left(X=1\right), only in exceptional cases does the thesis of the Condorcet theorem hold (in which case, when aggregating votes, we would behave as an agent voting for the correct option with probability 1). That is, we would have to be very “lucky” to draw a distribution of competences that implies the theorem. Animation created with the Manim Python package; set playback to HD quality.

The conclusions can vary. We may well have evidence that voters are competent, which would counteract the low prior probability: for example, a jury whose members have been informed for months and possess the skills to reach a “fair” verdict. It might also be necessary to invest more in education to ensure competence. However, another avenue explored in the literature is to modify the aggregation process itself, in particular moving from simple majority to weighted majority, where not every vote counts the same. The idea is to give more weight to those with a higher probability of being right. In the 1980s, several results established how these weights should be chosen.

Some insights into these results were provided by the Nobel laureate in Economics Lloyd Shapley and Bernard Grofman. Suppose we have voters with competences (0.9, 0.9, 0.6, 0.6, 0.6). We can consider various aggregation rules:

  • Under the expert rule (taking a single voter, the one with the highest probability of being right), the probability is 0.9.
  • Under the simple majority rule, the probability is \approx 0.877. This improves on the average individual competence, as the CJT already indicated, but falls below the expert rule.
  • Under the weighted majority rule (with weights proportional to \log(p/(1-p)), where p is the probability of being right), the probability is \approx 0.927, improving on both previous rules.

This result may be counterintuitive: we are assigning non-zero weights to the less competent voters and nevertheless improving the total probability compared with the expert rule. We can understand this better by observing that the less competent members can break a tie when the two most competent individuals disagree. The authors conclude by noting the importance of such weighted majority theorems for vote aggregation processes in various fields:

While the results of this essay seem particularly appropriate to analysis of the problem of ‘information pooling’, in which the task is to weigh the advice of ‘experts’ or to reconcile ‘expert’ and ‘non-expert’ conflicting opinion; we believe Theorem II [optimal weights] to be of considerable general importance for democratic theory.

Following this direction, our work also considers the CJT for a weighted majority rule and asks how likely the CJT thesis is once these weights are included. However, we explore a different type of weight: weights that are strictly positive (greater than one), bounded above, and subject to some stochastic error (reflecting a measurement or allocation error). It is important to note that the results mentioned above rely on weights that violate all of the properties we have just listed. While those weights are mathematically optimal, they can be problematic in real-life situations; for example, a negative weight is equivalent to reversing the vote it multiplies. We choose to investigate weights under which all votes count for several reasons. One is the objection that, in some circumstances, disallowing (or reversing) certain votes can express a lack of respect, that is, a semiotic objection based on the expressive value of the voting process. When every potential voter is guaranteed a minimum weight, these objections carry less force. In fact, votes are already weighted differently in many current processes, although usually by factors other than competence.

Under these conditions, we show that with appropriate weights the probability of the theorem’s applicability can be made equal to one, and this holds for very general distributions of voter competence, including very unfavorable ones. The main ingredient of the theorem is a (not necessarily perfect) correlation between the weights and competence. We take this task seriously: the work moves from mathematical economics to behavioral economics, showing how these weights could be assigned in practice (more details on the underlying behavioral economics and psychometrics were considered here).
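The mechanism can be illustrated with a toy weighting scheme (an assumption of mine, not the paper’s construction): even with competences drawn uniformly on (0, 1), the unfavorable case above, positive bounded weights that are only noisily correlated with competence push the weighted majority toward certainty as the number of voters grows.

```python
import random

def weighted_rate(n, trials=2000, seed=1):
    """Competences p ~ U(0,1); weight w = 1 + p + noise, so weights
    are positive, bounded above, and imperfectly correlated with
    competence. Estimates P(weighted majority correct)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        score, half = 0.0, 0.0
        for _ in range(n):
            p = rng.random()
            w = 1.0 + p + rng.uniform(-0.2, 0.2)   # noisy weight in [0.8, 2.2]
            half += w / 2
            if rng.random() < p:                    # correct vote
                score += w
        wins += score > half
    return wins / trials

for n in (11, 101, 1001):
    print(n, weighted_rate(n))
```

Unlike the unweighted symmetric case, the estimated accuracy now increases with n: the positive correlation between weight and competence creates a systematic drift of order n toward the correct option, while the fluctuations only grow like the square root of n.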

If we now set weights for the weighted majority rule that are correlated with epistemic rationality, i.e., with the probability of voting for the correct option, \mathbb{P}\left(X=1\right), the thesis of the Condorcet theorem becomes “almost” certain. Animation created with the Manim Python package; set playback to HD quality.

In conclusion, the CJT is shown to be highly improbable unless we have strong evidence about the competence of the voters or include epistemic weights (weights related to epistemic rationality) in the aggregation process. The CJT is an important and very useful result for improving decision-making, but we must ensure that it actually applies where it is invoked. This work attempts to open a line of research in this direction.

Footnotes

1 Notice that some topics are more prone than others to be solved as epistemic rationality increases. For instance, discussing the means to achieve an agreed end can be easier than discussing the ends we should pursue.

Complete Article

  • Article published in Mathematical Social Sciences, link.
  • Free accessible version through arXiv, link.
