Wednesday, January 20, 2010

Differing opponent strengths

Below is a graphic of a traditionally-run tournament.

The horizontal axis shows each team's final strength; the vertical axis shows each team's average opponent strength; the size of the bubble shows, for each team, the standard deviation of its opponents' strengths. A small bubble represents a team that debated opponents that were all very close together in strength. A large bubble represents a team that debated a wide cross-section of opponents, some weak and some strong.
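To make the three plotted quantities concrete, here is a minimal sketch of how they could be computed. The post does not say how strength is measured, so this assumes it is simply a team's number of wins; the function name `opponent_spread` and the four-team data are hypothetical.

```python
from statistics import mean, pstdev

def opponent_spread(strength, opponents):
    """Return {team: (own strength, mean opponent strength, std dev of opponent strengths)}."""
    summary = {}
    for team, faced in opponents.items():
        opp = [strength[o] for o in faced]
        # x position, y position, and bubble size for this team
        summary[team] = (strength[team], mean(opp), pstdev(opp))
    return summary

# Hypothetical data: a four-team round robin, strength assumed to be wins.
strength = {"A": 3, "B": 2, "C": 1, "D": 0}
opponents = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C"],
}

summary = opponent_spread(strength, opponents)
for team, (own, avg, spread) in summary.items():
    print(team, own, round(avg, 2), round(spread, 2))
```

A small third value corresponds to a small bubble (opponents bunched together in strength); a large one corresponds to a wide cross-section.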

I think about how a debate tournament ought to look, if it's paired fairly. It seems to me that every team ought to have a good cross-section of opponents. Thus, a fair tournament would be like a partial round robin. We would know that 3-3 teams were truly middle-of-the-pack because of their abilities, not because they got an unfair draw. The bubbles in the diagram would be bigger (each team sees a true cross-section of opponents) and closer to a horizontal line (each team's average opponent strength would be closer to the overall average opponent strength).

It's relatively easy to pair a tournament like this, even on the fly. To pair a round, you can look at each team's opponents and decide what is missing so far. After three rounds, a team might have debated a 0-3, a 2-1, and a 3-0 opponent; it would now debate a 1-2 opponent. It is true that the opponents' records change after the fourth round, but the process is repeated each round, and by the end, most teams will have debated a decent cross-section of opponents from 0-6 to 6-0. Of course, traditional tournaments do not do this; teams debate opponents within brackets. Why?
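The "decide what is missing" step might look something like the sketch below. It is only an illustration under stated assumptions: it picks an opponent for one team at a time (a real pairing has to match the whole field at once), it compares opponents by their current win totals, and the name `pick_opponent` and the sample teams are hypothetical.

```python
from collections import Counter

def pick_opponent(team, wins, faced, available):
    """Choose the available team whose current record is least represented
    among the opponents `team` has already debated."""
    # How many past opponents currently sit on each win total
    seen = Counter(wins[o] for o in faced[team])
    candidates = [t for t in available if t != team and t not in faced[team]]
    if not candidates:
        return None
    # A record not yet faced has count 0, so it is preferred; ties break by name
    return min(candidates, key=lambda t: (seen[wins[t]], t))

# Hypothetical state after three rounds: team T has met a 0-3, a 2-1, and a 3-0 team.
wins = {"T": 2, "X": 0, "Y": 2, "Z": 3, "P": 1, "Q": 2, "R": 3}
faced = {"T": ["X", "Y", "Z"]}
available = ["P", "Q", "R"]

choice = pick_opponent("T", wins, faced, available)
print(choice)  # the only 1-2 team on offer
```

Because records are looked up fresh each round, the same function keeps working as opponents' records drift in later rounds.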

The reason is that brackets increase the accuracy of rankings. Consider a 4-2 team. Does it deserve to break? If the tournament pairs it against a representative cross-section, this team would debate a 6-0 opponent, a 5-1, a 4-2, a 3-3, etc. There's only one opponent with an equal record -- but it's precisely the comparisons to very similarly-abled opponents that yield the most accurate information about a team's true strength. In a bracket system, the same team would likely debate several 4-2 opponents. There are more points of comparison, allowing for finer rankings. The downside, though, is that a team could go through the preliminary rounds of the tournament debating opponents that are all at exactly the same level. It seems to me that something valuable would be lost.

Of course, these two virtues -- fairness and accuracy -- trade off. You can't maximize both. But there are several ways to get a reasonable equilibrium. For example, pair odd rounds to have every team debate a reasonable cross-section of opponents (i.e., across brackets), and pair even rounds to increase accuracy of rankings (i.e., within brackets).
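As a rough sketch of the alternating scheme, the function below switches pairing rules by round number. The two rules themselves are stand-ins I am assuming for illustration: "across brackets" is approximated by pairing the top half of the ranking against the bottom half, and "within brackets" by pairing neighbours in the ranking. It ignores rematches and side constraints, and with an odd number of teams one team is simply left unpaired. The name `pair_round` is hypothetical.

```python
def pair_round(round_number, wins):
    """Pair teams: odd rounds across brackets, even rounds within brackets."""
    # Rank by wins (best first), breaking ties by name for repeatability
    ranked = sorted(wins, key=lambda t: (-wins[t], t))
    if round_number % 2 == 1:
        # Odd round, across brackets (assumed rule): top half meets bottom half
        half = len(ranked) // 2
        return list(zip(ranked[:half], ranked[half:]))
    # Even round, within brackets (assumed rule): neighbours in the ranking meet
    return list(zip(ranked[0::2], ranked[1::2]))

# Hypothetical records after two rounds
wins = {"A": 2, "B": 1, "C": 1, "D": 0}

round3 = pair_round(3, wins)
round4 = pair_round(4, wins)
print(round3)
print(round4)
```

In the odd round every pairing spans different records; in the even round the two 1-1 teams would meet only if the ranking put them side by side, which is why a real within-bracket pairing would group by record first.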
