Tuesday, August 9, 2011

A modest proposal to ensure geographic mixing at Nationals

I wrote an article (visible below), published in the inaugural edition of the National Journal of Speech and Debate, about using geographic and strength criteria to mix teams in preliminary rounds at N.F.L. Nationals. In other words, I advocate the use of a system that ensures that each team will debate a broad cross-section of different opponents, from different parts of the country and at different skill/experience levels, as measured by N.F.L. debate points. I won't repeat the arguments here about why I think this is a worthwhile goal, except for this one thought: Geographic mixing makes it likely that the less experienced teams -- who probably have not travelled far afield -- will debate opponents at Nationals they have never seen before. Even with geographic mixing, there is still a chance that national circuit teams might face an opponent in preliminary rounds at Nationals they have debated many times during the invitational season. That is why there is also a need for skill/experience level mixing. Both are necessary to make it likely every team will see "new" opponents at Nationals.

I will focus on two technical concerns about my proposal in this blog post.

Concern 1: Can two criteria really be maximized at the same time?

Yes and no. In a strict sense, no: only one variable can truly be maximized at a time. That is to say, you can have a round where the average geographic distance between opponents is maximized, or you can have a round where the average difference of skill/experience between opponents is maximized, but you cannot have both at the same time. However, in a looser, more practical sense, the answer is yes: you can have a round where opponents are well-mixed geographically (even though not maximally mixed) AND well-mixed skill/experience-wise (even though not maximally mixed). Let me show you with some sample data.

Here are 26 fictitious teams, spread throughout the country in geographic clusters, and at different skill/experience levels (roughly normally distributed between 0 and 1 in my sample data). I imagined that the N.F.L. points could be scaled so the weakest team to qualify was given a rating of 0 and the most experienced a 1, but they do not need to be scaled at all for this method to work. Across every possible pairing in the whole set, the median distance between opponents is 2138 miles and the median difference in experience is 0.28 units.
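To make the two baseline figures concrete, here is a minimal sketch of how an all-pairs median distance and an all-pairs median skill difference could be computed. The six teams, their coordinates, and their ratings are hypothetical stand-ins (they are not my 26 sample teams), and the great-circle (haversine) formula is an assumption about how distance would be measured.

```python
import math
from itertools import combinations
from statistics import median

# Hypothetical teams: (name, latitude, longitude, skill rating from 0 to 1).
TEAMS = [
    ("Seattle", 47.61, -122.33, 0.10),
    ("Miami", 25.76, -80.19, 0.55),
    ("Boston", 42.36, -71.06, 0.90),
    ("Denver", 39.74, -104.99, 0.40),
    ("Dallas", 32.78, -96.80, 0.70),
    ("San Diego", 32.72, -117.16, 0.25),
]

EARTH_RADIUS_MILES = 3958.8


def miles_between(a, b):
    """Great-circle (haversine) distance in miles between two teams."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[1], a[2], b[1], b[2]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def skill_gap(a, b):
    """Absolute difference between two teams' skill ratings."""
    return abs(a[3] - b[3])


def baseline_medians(teams):
    """Median distance and median skill gap over every possible pairing."""
    pairs = list(combinations(teams, 2))
    return (median(miles_between(a, b) for a, b in pairs),
            median(skill_gap(a, b) for a, b in pairs))


median_miles, median_gap = baseline_medians(TEAMS)
print(f"median distance: {median_miles:.0f} miles, median gap: {median_gap:.2f}")
```

These two medians are the yardsticks for everything that follows: a round is judged by how many of its matches land above them.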

If one tries to maximize geographic spread, the round 1 pairings that are selected would look like this:

Across these pairings, the median distance between opponents is 3333 miles, and the shortest distance is 2452 miles. In other words, every match chosen is above the all-pairs median of 2138 miles. This is maximized; it is a Pareto optimal solution, meaning that any change to improve a pairing by swapping opponents would have to make another pairing worse. The net result cannot be improved. In the first round, teams in the middle of the country would debate coastal teams. As the tournament proceeds, each team would get opponents from every geographic region of the country.
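The "no swap can help" idea can be sketched in a few lines. This is only an illustration, not necessarily the procedure behind the pairings above: it is a simple local search that keeps swapping opponents between two pairings whenever the swap raises their combined distance, and stops when no single swap helps. The team names and east-west positions are hypothetical, and a local search like this is not guaranteed to find the best possible round.

```python
# Hypothetical teams placed along an east-west line, positions in miles.
POSITIONS = {"A": 0, "B": 100, "C": 1500, "D": 1600, "E": 2900, "F": 3000}


def distance(a, b):
    """Distance in miles between two hypothetical teams."""
    return abs(POSITIONS[a] - POSITIONS[b])


def improve_by_swaps(pairs, score):
    """Swap opponents between two pairings whenever that raises their combined score.

    When the loop ends, no single swap helps: any swap that improves one
    pairing must cost the other pairing at least as much.
    """
    pairs = [tuple(p) for p in pairs]
    improved = True
    while improved:
        improved = False
        for i in range(len(pairs)):
            for j in range(i + 1, len(pairs)):
                (a, b), (c, d) = pairs[i], pairs[j]
                current = score(a, b) + score(c, d)
                # The two other ways to pair up these four teams.
                for new_i, new_j in (((a, c), (b, d)), ((a, d), (b, c))):
                    if score(*new_i) + score(*new_j) > current + 1e-12:
                        pairs[i], pairs[j] = new_i, new_j
                        improved = True
                        break
    return pairs


# Start from the worst case: every team paired with its nearest neighbor.
start = [("A", "B"), ("C", "D"), ("E", "F")]
final = improve_by_swaps(start, distance)
for a, b in final:
    print(a, "vs", b, distance(a, b), "miles")
```

Passing a skill-difference function as `score` instead of `distance` would mix the same way on experience.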

If one tries to maximize the differences of skill/experience levels, the round 1 pairing would look like this:

The median difference between opponents is 0.45 units, and the least difference is 0.42 units. Every match chosen is above the all-pairs median of 0.28 units. Again, this is a Pareto optimal solution. In the first round, mid-level teams would debate either inexperienced or highly experienced teams. No inexperienced team would be matched against a highly experienced team, because doing so would force two mid-level teams to debate each other. However, in further rounds, each team would get opponents at every different level.

What happens if you try to maximize both? The resulting pairings would look like this:

The median distance is 3092 miles, and the shortest distance is 2003 miles (with only 15% of matches below the all-pairs median of 2138 miles). The median difference is 0.45 units, and the least difference is 0.23 units (with 23% of matches below the all-pairs median of 0.28 units). Although these pairings do not maximally mix for geography, they do pretty well. And likewise for difference in skill/experience level. This represents a lower bound on how well this method could work. If we use a larger data set, such as the 200+ teams at Nationals, then it becomes easier to find pairings that score well on both criteria.
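One way to fold the two criteria into a single number, and to produce the kind of summary quoted above, is sketched below. The scoring rule (scale each criterion by its all-pairs median, then take the weaker of the two) is an assumption chosen for illustration, not a description of how the pairings above were produced, and the four matches are made-up numbers rather than my sample data.

```python
from statistics import median

MEDIAN_MILES = 2138  # all-pairs median distance from the sample data
MEDIAN_GAP = 0.28    # all-pairs median skill difference from the sample data


def combined_score(miles, gap):
    """Score a match by its weaker criterion, each scaled by its baseline median."""
    return min(miles / MEDIAN_MILES, gap / MEDIAN_GAP)


def summarize(matches):
    """Summarize a round given as a list of (miles, gap) tuples, one per match."""
    miles = [m for m, _ in matches]
    gaps = [g for _, g in matches]
    return {
        "median_miles": median(miles),
        "min_miles": min(miles),
        "share_below_median_miles": sum(m < MEDIAN_MILES for m in miles) / len(miles),
        "median_gap": median(gaps),
        "min_gap": min(gaps),
        "share_below_median_gap": sum(g < MEDIAN_GAP for g in gaps) / len(gaps),
    }


# Hypothetical round of four matches (illustrative numbers only).
round_1 = [(3100, 0.50), (2500, 0.20), (1900, 0.45), (2800, 0.31)]
report = summarize(round_1)
weakest_match = min(round_1, key=lambda m: combined_score(*m))
print(report)
print("weakest match:", weakest_match)
```

Taking the minimum means a match is only as good as its worse criterion, so a cross-country match between two evenly rated teams does not count as well mixed. Summing or averaging the two scaled values would be a looser alternative.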

Concern 2: Would this method create the same pairing year after year?

At first glance it might: an optimal solution for one year could be the same, or very similar, the next year. No one wants to see the same opponent in preliminary rounds two (or more) years in a row at Nationals.

However, this kind of optimization is chaotic, meaning it is extremely sensitive to small changes.
