Sunday, April 26, 2009

Standard debate tournament calculations

I thought it might be useful to post a list of standard probability/counting methods for debate tournaments. I list them in order of rapidity of growth -- the last listed expands most quickly, but they all increase without bound. For all of these, there a n teams at the tournament, and n is an even number.

First: once the teams have been separated into aff and neg pools, how many possible matches are there to consider?


Second: how many possible matches are there at a tournament, period? If you don't consider sides, then there are


possible matches. If you did consider sides (A aff vs. B is counted as different than B aff vs. A), then are twice as many possible matches.

Third: how many ways are there to separate the teams into aff and neg pools?


Fourth: once the teams have been separated into aff and neg pools, how many complete (perfect) pairings are possible?


This is the fastest growing number of all.

Friday, April 24, 2009

Pascal's triangle: a neat proof

There is an interesting question about how the terms in Pascal's triangle grow. All the terms in a row obviously grow (except the 1s at the extreme left- and right-hand sides of the triangle), but the rows' totals obviously grow, too. Do any of the terms in a row converge, as a percentage of the total of the row?

The most interesting example is the middle term in a row, e.g.:

1
1 1
1 2 1

the 1 and 2 are middle terms for their respective rows. The middle term is always the largest term and grows quickly. So, which grows faster: the row total or its middle term, or is there a convergence? A few examples illustrate what happens:

The 0th row -- 100%
The 2nd row -- 50%

The numerator is the formula for the middle term (see the binomial distribution); the denominator is the row's total. The largest row that my TI-84 calculator could handle before overflowing is the 332nd. As the examples show, the denominator is "winning" -- the middle term as a percentage of the total is steadily decreasing. Pretty amazing considering just how big the middle term is; for the 332nd row, the numerator is 3.8E98!

Can we write a proof of this? Certainly. The first step is to re-write the numerator. Given that the combination is really a simplification of


so, our specific numerator can be re-written as:


To re-write the limit for our whole expression, the middle term divided by the row total, would look like this:

Now the clever step. Let's re-organize that n factorial in the numerator and expand the terms in the denominator:

All the n factorial terms are there; they just been re-organized, with the even terms on one side and the odd terms on the other. It should make sense that the terms in the numerator "sit" perfectly in the "seats" provided by the denominator: there are n terms in the numerator, and there are two groups of n/2 terms in the denominator.

The proof is basically done. One simple re-write makes the final conclusion clear:

Each fraction is now on its own. The first set of fractions are all 1s. Every other fraction must be less than 1. An infinite multiplication of fractions less than or equal to 1 must converge to 0. So, in an infinite Pascal's triangle, the middle term, although it increases without bound, is 0% of the row's total.

---

The above paragraph is incorrect. A lot more work must be done to show that the product of the remaining fractions


goes to zero. I like explanations that just make intuitive sense. So, the first term is 0.5. I noticed that the product of the next four terms (terms two to five) is slightly less than 0.5. A little cancelation helps make why clear:


Thus:


And this continues. The product of the next sixteen terms (terms six to twenty-one) is also less than 0.5. Here are the terms:


A litte cancelation starts the process:


and reorganizing the denominators makes the result clear:


Each pair of terms is slightly less than one. Each pair of terms can be written in the form:


or


The product of the next 64 terms is also less than 0.5. (The number of terms is based on powers of 4.) So, the whole product can be rewritten as 0.5^infinity... that definitely goes to zero.

Tuesday, April 21, 2009

Pascal's triangle and brackets

As many debate coaches and competitors already know, because teams are always power-matching within brackets against opponents with the same record, it is possible to construct a simple branching diagram to predict how many teams are in each bracket, like so:

100
50 50
25 50 25

and so on. What they may not realize is the connection to Pascal's triangle, a famous problem in mathematics:

Here are the first nine rows. Each number in a row is created by adding the two numbers in the row above, the number diagonally left and the number diagonally right. The numbers in each row can easily be turned into percentages by dividing a number by the total of the row. For example, in row one, 1/1 is 100%; in row two, 1/2 is 50%; in row three, 1/4 is 25%. As you might have noticed, the total for each row is always 2^(n-1), where n is the row number. (There's also a shortcut for finding the numbers in the rows.)

Once everything has been converted into a percentage, it can be organized in a more easy-to-read chart:

Of course, some of these percentages are overly precise for a debate tournament. The percentages never quite work out exactly because of pull ups caused by uneven numbered brackets, side constraints, team constraints, etc. I always use the exact percentage to multiply by the number of teams at the tournament -- but I am never surprised by +/- 1 or even 2 teams off the prediction. Still, it can be a simple way to estimate the number of teams in each bracket, especially at a very large tournament where the constraints will have less effect. I've looked at the data from large tournaments before and found that the bracket size predictions were within a percentage point. That estimation is useful to estimate the number of teams that could break:


The left two columns show the percentages of teams with a winning record (excluding those with a 50-50% win-loss record) after various even numbers of rounds. The middle columns show the percentage of teams with winning records after various odd numbers of rounds -- always 50%. The right-most column shows the percentages of teams with winning records excluding the worst winning bracket (i.e., excluding 2-1s after three rounds, 3-2s after five rounds, and 4-3s after seven rounds).

Saturday, April 18, 2009

Partial elimination round calculator

I'm a big believer that every team at a debate tournament with a winning record should make it to elimination rounds. (Or, if this is simply impossible, break every team with a certain record, e.g. all 5-2s and no 4-3s. That only seems fair; breaking some 5-2s on speaker points, but not other 5-2s, places too much trust in the objectivity of speaker points.)

Of course, the problem is that breaking every team in a certain bracket creates the need to run a partial elimination round (because how often will a tournament need to break exactly 2^n teams?). The logistics of a partial elimination round can seem a bit unpredictable at first -- how many teams will need to be in the partial? how many judges will be needed? -- so I created a "partials calculator" to help estimate answers to these questions: http://www.mediafire.com/?tmtrnmqtnzu

A few notes about the calculator. I set it up with a default of 34% of teams breaking, a good assumption for the percentage of 4-2s or better, but this value can be changed in the green box. I also set up a default of one judge per two teams in the single-flighted event and one judge per three teams in the double-flighted event, although these values can also be changed in their green boxes.

The yellow highlighted boxes show when a tournament will need extra elim judges. For example, in the single-flighted event, there are very few situations where the tournament would need extra elim judges -- the most significant troubles are in the 100s-110s, where the tournament might be short 10 judges. Otherwise, there are few problems.

The double-flighted event shows two options: (a) single-flight the partial and double-flight the first elim round, or (b) double-flight the partial and single-flight the first full elim. Requiring one judge for three teams is a crucial value; there will enough judges in almost every case to single-flight one (or both) of these two rounds.

Wednesday, April 8, 2009

Side assignment algorithm found!

One of the most vexing problems in running a small tournament is assigning sides in odd rounds. If you assign the sides incorrectly in an odd round, there may be nothing but bad options left for the subsequent even round: break side constraints (a team is aff. rounds 3 and 4), break school constraints (force two teams from the same school to debate), or re-match two teams (two teams that have already met have to debate again).

For a round robin, there are very few possible solution in the third to last round. Say you have 8 teams, so before round 5 is paired, team A has three opponents remaining: B, C, and D. If A is assigned to the aff, then 2/3 of {B, C, and D} must be assigned to the neg side -- otherwise, it will be impossible to honor side constraints for rounds 5 and 6. This sounds easy until you realize that B, C, and D all have three opponents each remaining, 2/3 of whom must be assigned to the opposite side... There ends up being just a few ways to pair round 5 that are consistent with all side constraints and re-match constraints for round 6 too.

The complexity of this problem has lead tab programs to include a "schedule round robin" option that executes before any rounds have happened at all. Although round robins are the epitome of this problem, many small tournaments with heavy school constraints (many teams from the same school) also experience it, leading to late round pairings that violate many constraints -- and there's no option yet programmed that solves that. Even big tournaments suffer from a similar problem. Although there's no risk of a big tournament violating any constraints, there is always the question of how well mixed a tournament is. If the same group of teams always is aff in odd rounds {A, B, C, D, E...}, then there's no chance that A can debate B, B debate C, etc. The tournament is better mixed if round 1 affs are {A, B, C, D, E...} and round 3 affs are {A, C, E...} and so on. The current assumption is that randomness is sufficient to mix a tournament well, which might be true, but it's nice to have an algorithm in hand that guarantees that.

Now, you may be tempted to think that a computer program can solve this sequentially: pair round 3, then have it look ahead to round 4 and see if there are any problems. This may just work at a tiny tournament, but the number of possibilities becomes daunting quickly, before becoming downright impossible, even for a supercomputer (and I don't believe many tab rooms would want to book time on the Roadrunner supercomputer). If you set a pairing for round 3, then look at all the possible round 4 options that follow, there are (n/2)! options, where n = number of teams. For 40 teams, there are 2.43 E 18 options. The world's fastest computer, by my rough estimation, would take 37 minutes to verify that all the options for round 4 were bad. To put it in perspective, 2.43 E 18 is about how many stars we can see in the sky. There is a better way, and it's to do a clever simultaneous approach.

Here's the gist of what I found. The mixing issue and the constraints issue are actually two sides of the same coin. If in round 1 team A aff. defeats team B neg., and in round 2 team B aff. defeats C neg., then the computer should be able to reject several possible side assignments out of hand. The best mixed, least conflicted option is actually to put {A, B, C} on the aff. for round 3. A and B can't debate again, so they might as well be on the same side; ditto for B and C. The only possible round that's lost is A vs. C, but any other way to assign the sides rules out more possible rounds. A smart use of this fact enables the algorithm to quickly toss out many possible options and reduce it to a manageable number.

Does the algorithm actually work? Yes. I tested it on a regular laptop, and it will pair a round robin on the fly. Without giving the program any prior knowledge that a tournament will be a round robin, it just picks the best round each time in a few seconds -- and pairs the round robin correctly and completely. The algorithm itself is not complex, but I'll save it for this document:

Thursday, April 2, 2009

Neat proofs with tangents and slopes

Spring break is over, and it's time to do some math again!

Let's say you have two lines, y1 and y2, with unknown slopes. However, you do know that y1 is perpendicular to y2, like so:

Without invoking the triangle sum, how can you prove two marked angles are complementary? With a simple use of the co-function.

If angles a and b are complementary, then a + b = 90, and 90 - a = b.

The co-function of tangent is co-tangent: cot (90 - a) = tan (a)*, so cot (b) = tan (a) if a and b are complementary. Is that true? To see, a few more labels help.



Cot (b) is just x2/h, and tan (a) is just h/x1. So, is x2/h = h/x1? You bet -- the slope of lines are (negative) reciprocals of each other.

* Proving co-functions is simple. Starting with the unit circle:


The coordinates of (x, y) = (cos (a), sin (a)). However, if you begin at (1, 0) and move clockwise (therefore, an angle of 90 - a), the points are a mirror image reflected over the y = x line of those before, and therefore (y, x) = (cos(90 - a), sin(90 - a). This defines the basic co-functions: x = cos(a) = sin(90 - a), y = sin(a) = cos(90 - a). And it follows that cot(90 - a) = cos(90 - a)/sin(90 - a) = sin(a)/cos(a) = tan(a).