Here's how a high school or college debate tournament works: for the first two rounds of debating, each team is randomly assigned an opponent; for the third round, the winners of the first two rounds are assigned other winners as opponents, while losers debate losers. This system continues for several preliminary rounds, "power matching" teams against opponents with the same record of wins and losses, until the top "brackets" with winning records (7-0s and 6-1s, for example) move on to elimination rounds. Thus, the preliminary rounds are a type of Swiss system tournament, a format that is used in chess competition, too. The number of teams in each final bracket follows a perfect binomial distribution (plus or minus one team or two for odd numbers that require a team to be "pulled up" from a lower bracket).
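To make the binomial claim concrete, here is a minimal sketch of the expected bracket sizes. It assumes that in every round exactly half of each bracket wins and half loses, which is the idealized power-matched case. The function name `expected_bracket_sizes` and the 128-team field are illustrative assumptions, not figures from any real tournament.

```python
from math import comb

def expected_bracket_sizes(n_teams, rounds):
    """Idealized bracket sizes: n_teams * C(rounds, wins) / 2**rounds.

    Assumes half of every bracket wins each round, so final records
    follow the binomial coefficients exactly (no pull-ups, no byes).
    """
    return {
        f"{w}-{rounds - w}": n_teams * comb(rounds, w) / 2**rounds
        for w in range(rounds, -1, -1)
    }

# Hypothetical 128-team field over 7 preliminary rounds.
for record, size in expected_bracket_sizes(128, 7).items():
    print(record, size)
```

With a field that is not a multiple of 2^rounds, the sizes come out fractional, which is where the "pulled up" teams and the plus-or-minus-one adjustments come in.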
This approach is generally felt to be fair, although it is a recognized problem that a good team could lose the first or second round and then face easier opponents the rest of the way. How often does this happen? A visualization helps:
These are the preliminary varsity policy debate results at the 2009 Harvard invitational high school tournament. (I took out names because I don't want to seem like I'm ragging on any school; I'm really just interested in the math.) Each row represents a different bracket -- the 7-0 at the top, 6-1s one row down, etc., and the 0-7 at the bottom -- and each row is sorted from best speaker points (left) to worst speaker points (right) in that bracket. Each arrow represents one actual debate between two teams, pointing to the winner but drawn in the loser's row color. Every single round is there, but I bolded the rounds that the top nine teams won. You can see how differently the top teams (the 7-0 and 6-1s) earned their records. Some 6-1s, circled, defeated at least three teams that finished 5-2 or better. Other 6-1s, in squares, defeated only one 5-2 or none at all. The 6-1 on the far left did not defeat a single team in the top 20%. Perhaps they could have, but they never even faced one. They made it into elimination rounds on the basis of an easier schedule than any other 6-1.
Let me make it absolutely clear: I'm not criticizing the folks who run the Harvard tournament. They do a fine job. The problem is not with their execution. I'm sure that at every point, the 6-1s were given proper opponents for their records; the problem is that some of those opponents went on to lose many of their remaining rounds and revealed their weakness. The problem is the method, which is only as good as each team's current record is at reflecting its true strength. Since true strength can't be known in advance, the only solution so far has been to repeat the process many, many times to thoroughly test and properly rank each team in preliminary rounds. Potentially, what you're looking at above is a raw sort that still contains some errors, like ABCEDLFGJIMNP... it's getting better, but there's still a need for further sorting.

Consider it this way: the first round is supposed to determine whether a letter is in the first half of the alphabet or not, by picking up two letters at the same time and determining which comes first. Generally speaking, this works, and A, B, C, etc., are likely to end up in the first-half pile. But what happens if the letters you pick up to compare are T and W? T will be misleadingly placed in the first-half pile. You hope that this doesn't happen two, or three, or seven times in a row, but clearly, it can and did happen: a team made it into the top 6% without ever facing an opponent in the top 20%.
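The sorting analogy can be tried out with a toy simulation. The sketch below is a deliberately simplified model, not the tournament's actual pairing software. It assumes the stronger team always wins, ignores rematch and side constraints, and pairs adjacent teams after sorting by record, so an odd bracket pulls up a team from the bracket below. The names `simulate_swiss` and `quality_wins` and the 128-team field are hypothetical.

```python
import random

def simulate_swiss(n_teams=128, rounds=7, random_rounds=2, seed=0):
    """Toy power-matched tournament; team 0 is strongest, n_teams must be even."""
    rng = random.Random(seed)
    wins = [0] * n_teams
    beaten = [[] for _ in range(n_teams)]  # opponents each team defeated
    for r in range(rounds):
        order = list(range(n_teams))
        rng.shuffle(order)  # random pairing within each bracket
        if r >= random_rounds:
            # Stable sort keeps the shuffle inside each bracket while
            # grouping teams with the same record next to each other.
            order.sort(key=lambda t: -wins[t])
        for a, b in zip(order[0::2], order[1::2]):
            # Assumption: the stronger (lower-numbered) team always wins.
            winner, loser = (a, b) if a < b else (b, a)
            wins[winner] += 1
            beaten[winner].append(loser)
    return wins, beaten

def quality_wins(wins, beaten, record=6, threshold=5):
    """For each team with `record` wins, count defeated opponents that
    themselves finished with at least `threshold` wins."""
    return {
        t: sum(wins[o] >= threshold for o in beaten[t])
        for t in range(len(wins))
        if wins[t] == record
    }

wins, beaten = simulate_swiss()
print(sorted(quality_wins(wins, beaten).values()))
```

Changing `seed` or `rounds` gives a rough feel for how much the quality of a 6-1 record varies even in this idealized setting, where there are no upsets at all.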
Randomness isn't enough. There needs to be an element added to power-matching that controls for strength of schedule. If you need further convincing, here are the 5-2s highlighted:
The circled 5-2s defeated at least one other 5-2. (It's hard to see those blue arrows, so click on the image for expansion first.) The 5-2s in squares defeated only one or two teams that finished 4-3 or better -- that is, they made it into the top 20% and elimination rounds on the basis of defeating only one or two teams in the top 40%. Those are quite disparate schedules: debating other 5-2s and several 4-3s, versus debating a few 4-3s and then several weaker teams.
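One simple way to put a number on this disparity is the average final win total of a team's opponents. The sketch below is only one possible strength-of-schedule measure, assuming results are available as (winner, loser) pairs. The function name and the four-team sample data are made up for illustration.

```python
from collections import defaultdict

def strength_of_schedule(results):
    """Average final win count of each team's opponents.

    `results` is a list of (winner, loser) pairs, one per debate.
    """
    wins = defaultdict(int)        # teams with no wins default to 0
    opponents = defaultdict(list)
    for winner, loser in results:
        wins[winner] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)
    return {
        team: sum(wins[o] for o in opps) / len(opps)
        for team, opps in opponents.items()
    }

# Made-up results, not data from the Harvard tournament.
sample = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]
print(strength_of_schedule(sample))
```

Two teams with the same record but very different averages under a measure like this are the circled and squared teams in the charts.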